DOCUMENT RESUME

ED 110 016                                             IR 002 28a

AUTHOR          Zunde, Pranas
TITLE           Scientific and Technical Information Transfer for
                Education (STITE). Research Report No. [illegible]
INSTITUTION     Georgia Inst. of Tech., Atlanta. School of
                Information and Computer Science.
SPONS AGENCY    National Science Foundation, Washington, D.C.
PUB DATE        Jun 75
NOTE            159p.; For related documents see ED 095 867 through
                869

EDRS PRICE      MF-$0.76 HC-$8.24 PLUS POSTAGE
DESCRIPTORS     Annual Reports; *Computer Programs; Concept
                Formation; Data Bases; *Experimental Programs;
                Information Centers; Information Dissemination;
                Information Retrieval; Information Scientists;
                Information Storage; Information Systems;
                Instructional Design; Learning; *Science Instruction;
                Science Materials; *Science Teachers; Scientific
                Concepts; *Use Studies
IDENTIFIERS     *Project STITE



ABSTRACT

Emphasizing the design of a data base management system for the
experimental STITE (Scientific and Technical Information Transfer for
Education) project, this progress report details the emerging features
of this projected facility. Compiled by four STITE researchers, the
report examines science information communication, learning,
dissemination, and the structure of STITE itself. In addition, 76
flowcharts are presented which document the STITE data base management
system. (DS)






RESEARCH REPORT

SCIENTIFIC AND TECHNICAL INFORMATION TRANSFER FOR EDUCATION
(STITE)

Fourth progress report on research performed at the School of Information
and Computer Science, Georgia Institute of Technology, Atlanta, Georgia,
under National Science Foundation Grant No. GN-36114.

Pranas Zunde
Project Director

June, 1975

GEORGIA INSTITUTE OF TECHNOLOGY
ATLANTA, GEORGIA



PREFACE

During the period of research work described in this report, main
emphasis was placed on the design of a data base management system as
part of the projected experimental STITE facility. Some of the emerging
features of this system are described in detail in Part Four of the
report. Parallel to this development, further studies were made both
for the purpose of sharpening certain conceptual prerequisites for the
design of the STITE interface system and for obtaining further insights
into potential user needs and requirements. These studies are described
in Parts One, Two, and Three.

Participation in this phase of the study of Dr. L. J. Gallaher,
Senior Research Scientist, Dr. T. C. Ting, Associate Professor of
Information and Computer Science, Dr. Albert N. Badre, Assistant Professor
of Information and Computer Science, and Mrs. Dorothy S. Hughes, Research
Analyst and Librarian, is gratefully acknowledged. They appear as authors
or co-authors of the portions of this report to which they made major
contributions.




TABLE OF CONTENTS

PART ONE - ON THE STRUCTURED COMMUNICATION OF
           SCIENTIFIC AND TECHNICAL CONCEPTS

  I.    Introduction
  II.   The Concept in Effective Instructional Communication
  III.  Concept Acquisition
  IV.   Conditions Imposed by the Instructional Communication
        of Verbal Concepts
  V.    Variations in User Characteristics in Terms of
        Instructional Objectives
  VI.   References

PART TWO - SCIENCE INFORMATION TRANSFER FOR LEARNING

  I.    Introduction
  II.   Previous User Studies
  III.  Goals of the Present Study
  IV.   Method of Investigation
  V.    Results
  VI.   Conclusions
  VII.  References

PART THREE - ANALYSIS OF THE EDUCATIONAL USE OF AN INFORMATION CENTER

  I.    Introduction
  II.   Personal Inquiry to Users
  III.  The Questionnaire to GIDC Users

PART FOUR - DESIGN OF STITE DATA BASE MANAGEMENT SYSTEM

  I.    Introduction
  II.   Data Base Elements
  III.  General Description of the STITE Interface System
  IV.   The Physical and Logical Structure of the Files
  V.    The Command Language
  VI.   The Hashing Procedure
  VII.  Association List Structure
  VIII. Characteristics of the Linkage-Paging System
  IX.   Structure of the Programs

APPENDIX

  A.    Letter to Users of the Georgia Information
        Dissemination Center (GIDC)
  B.    Questionnaire to Users of the Georgia
        Information Dissemination Center (GIDC)



PART ONE

ON THE STRUCTURED COMMUNICATION
OF SCIENTIFIC AND TECHNICAL CONCEPTS

Albert N. Badre



I. INTRODUCTION

One of the goals of the STITE project is to analyze and define the
process by which science learning materials are developed and structured.
This task would require the intuitive identification and definition of
the elements of science learning which are used in the instructional
communication of scientific and technological concepts. One of the key
questions that might be considered in this context has to do with the
elements associated with the design and structure of the "course". In
an earlier STITE report, it was suggested that the "course", or the
"science-subject space", consists of a number of major elements associated
with modes of presenting the learning material, such as illustrative
narrative presentations, definitional narrative presentations, problems,
examples and exercises, conjectures and hypotheses (9).

The present document aims at analyzing the elements of instruction
in terms of the process of developing a science-subject space. This will
be done by considering two main questions:

(1) What are the prerequisite concept-communication units of an
effective verbal-instructional presentation?

(2) How can the specification of levels at which a scientific concept
is communicated lead to a precise classification of user course-development
characteristics for the STITE system?



These are two questions which should not be disregarded in the process
of developing a course. Accordingly, they must be taken into account in
any attempt to design an artificial system which is intended to aid the
instructor in the transfer and utilization of scientific and technical
information from its present repositories to the instructional milieu.

The first question is motivated by the recognition that there are
certain basic factors associated with verbal concept communication which
are independent of the subject-matter or the instructor per se and which
impose necessary conditions required by the effort to make the instructional
delivery effective. The second question follows from the assumption that
there are certain goal-conditions associated with user-instructor
characteristics and imposed on the communication process by the individual
instructor which can be classified in terms of instructional objectives.

II. THE CONCEPT IN EFFECTIVE INSTRUCTIONAL COMMUNICATION

It may be suggested that the basic unit of instructional communication
is the concept. Initially, a concept may be defined as the sum total of
the intensional attributes of the object of reference along with an associated
set of relations over the attributes.

Given that the proper management of science concept learning in an
instructional setting provides for a generally more adequate attainment of
the overall learning objectives, the problem of developing instructional
programs such that a learner's concept attainment is maximized becomes one
of fundamental concern for the design of an instructional system or the
design of a system that aids the instructor in the processing and structuring
of information for learning situations.



III. CONCEPT ACQUISITION

It seems appropriate to start by analyzing the concept-unit of
information in terms of the processes of acquisition and learning. Concept
learning is recognized in terms of two basic processes: concept formation
and concept attainment (3). The process of concept formation involves
the learning of new concepts. It is one whereby a learner acquires an
extensional recognition of the concept at hand. That is to say, when a
learner encounters a totally new stimulus and simultaneously (or subsequently)
comes into possession (whether through learning or assignment) of the name
of the concept of which the newly encountered stimulus is a positive instance,
then it can be said that he has "formed" the given concept.

Concept attainment, on the other hand, requires an intensional
understanding of the concept to be learned. That is, a learner is said to have
"attained" a concept if, after he has already "formed" it, he is able to
identify the same concept in terms of its necessary and sufficient attributes
or dimensions. The notion of "attainment" requires that the learner, given a
class of instances, be able to discriminate between those which do not belong
to the concept and those that do on the basis of his knowledge of the concept's
necessary and sufficient attributes. His intensional understanding of the
concept will presumably enable him to eliminate those attributes which are
irrelevant.

As an illustration of the above, suppose that in a course on graph theory,
a learner encounters for the first time an instance of the concept which he is
told is called "graph". By thus encountering an instance of the concept and
labelling it "graph", he is said to have formed the concept of "graph". Then,
once he is able to discriminate "objects" which are called "graphs" from
those which are not, on the basis of the relevancy and irrelevancy of the
attributes of the concept "graph", then it can be legitimately said that
he has attained the concept of a "graph". Moreover, it is important to
note that concepts become more complex as the number of attributes and
their values increase. For example, if one learns that there are different
kinds of graphs, e.g. Eulerian graphs, Hamiltonian graphs, then it is not
only necessary to learn how to distinguish between graphs and non-graphs,
but also to be able to recognize and discriminate between at least two
sub-categories of graphs.

Hunt (6) suggests that the learning of concepts occurs only at the
formation stage. That is, the process of concept formation is coterminous
with that of concept learning. He views the attainment stage as merely
being the stage of identification of an already learned concept. Thus,
to Hunt, concept learning occurs at the formation stage where one presumably
learns a new concept. Once that concept has been assimilated into the
learner's cognitive structure, it becomes the object of retrieval and
recoding. It becomes a problem for semantic memory.

This distinction made by Hunt seems not so much to contradict as to
clarify and make more useful, at least from a processing approach, the Bruner
categorizations. It is appropriate to be able to distinguish, in the
framework of instructional communication, new concepts not yet learned and
concepts already learned but which fall under some other category of concept
learning, e.g. concepts already learned but not mastered, ambiguous concepts
that need to be reduced to the relevant meaning, and misconstrued concepts
that require re-learning (1). This categorization, in turn, leads to some
useful generalizations about concepts learned in an instructional setting.



IV. CONDITIONS IMPOSED BY THE INSTRUCTIONAL COMMUNICATION OF VERBAL CONCEPTS

There are several conditions which have been empirically shown to be
associated with an instructional event when such an event involves the
communication of verbal concepts (1). Concept acquisition in instructional
communication can be viewed as consisting of associating a set of attributes
with the name of a concept. An attribute can be viewed as that without
which a given entity will no longer be precisely that given entity.
Moreover, concepts can be thought of operationally as the mental abstractions
of life-experiences in which an experience is defined as the response to any
stimulation. This would lead to the concept that learning is directly
influenced by the reinforcement given a response. Moreover, the learning
of a concept is strengthened if the experiences of the learner in relation
to the concept are in some respect similar. The way in which those experiences
are similar is in a sense an expression of the concept. In this sense, the
experiences are the positive instances of the concept. Another necessary
condition is that a set of negative instances of the concept be generated
and stored in the repertoire of the learner's experiences. Still another
necessary condition is that there be an appropriate sequencing of positive
and negative instances. This sequencing is important because the complexity
of a concept increases as a function of an increase in the number of relevant
attributes. In addition, the number of values that each attribute has may
increase (e.g. man; black man; tall black man; tall and strong black man;
tall, handsome, and strong black African man). Hence concept attainment
will necessarily become more difficult. Thus, the sequencing, cognitive
mapping, and mental organizing of attributes and negative instances of a
concept become highly essential. Still another necessary condition is that
a concept be presented to the learner in as many different classes of tasks
as possible. Johnson and Stratton (7) report that when, in an experiment,
they classified a concept, defined it, used it in a sentence, and gave it
synonyms, they produced superior results to those produced when only one
of those four methods was used to explicate the concept.

This leads to the description of a concept acquisition model which
was developed previously (1) in order to describe and use a scale for
verbal concept acquisition. This model assumes that the concept-unit of
instructional communication is a function of the various combinations of
relevant and irrelevant attributes.

     Let ZAID denote the name of a concept, and Z.A.I.D. be the
     conjunction of its necessary and sufficient attributes; Z'.A'.I'.D'.,
     the irrelevant attributes of ZAID; Z-.A-.I-.D-., any set of attributes
     not related to ZAID; and z.a.i.d. another set of necessary and
     sufficient attributes of ZAID, where z.a.i.d. ≠ Z.A.I.D.

     Response categories. Then, if learner's response to what
     is ZAID is: (a) Z.A.I.D., concept is attained; (b) none of the
     above classifications, concept is new; (c) Z.A., Z.I., Z.D.,
     such that if and only if a subpart of the conjunction of necessary
     and sufficient attributes is given, concept is vague; (d) any
     combination that includes irrelevant (primed) or unrelated (barred)
     attributes, such that the response entails either irrelevant or
     nonrelevant attributes or both, concept is misconstrued;
     (e) z.a.i.d., and only z.a.i.d., concept is ambiguous;
     (f) z.a.d. = ambiguous-vague intersection; (g) z.a.Z. = ambiguous-
     misconstrued intersection. (1)
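
The response categories above can be illustrated with a small sketch. It is only one possible reading of the model, not the procedure used in (1): attributes are treated as plain set members, the two sets of necessary and sufficient attributes are assumed to share no members, an empty response is taken to mean a new concept, and the function and argument names are hypothetical.

```python
def classify_response(response, necessary, alternate=frozenset()):
    """Classify a learner's answer to "what is ZAID?".

    response  -- attributes the learner names
    necessary -- the necessary and sufficient attributes (Z.A.I.D.)
    alternate -- a second valid attribute set (z.a.i.d.), assumed
                 here to share no members with `necessary`
    """
    response = frozenset(response)
    necessary = frozenset(necessary)
    alternate = frozenset(alternate)

    if not response:
        return "new"                      # (b) nothing recognizable given
    if response == necessary:
        return "attained"                 # (a) exactly Z.A.I.D.
    if response == alternate:
        return "ambiguous"                # (e) exactly z.a.i.d.

    if response & alternate:
        # Some alternate-sense attributes are present.
        if response - alternate:
            return "ambiguous-misconstrued"   # (g) e.g. z.a.Z.
        return "ambiguous-vague"              # (f) e.g. z.a.d.

    if response < necessary:
        return "vague"                    # (c) proper subpart of Z.A.I.D.
    return "misconstrued"                 # (d) irrelevant or unrelated attributes


NECESSARY = {"Z", "A", "I", "D"}
ALTERNATE = {"z", "a", "i", "d"}
print(classify_response({"Z", "A"}, NECESSARY, ALTERNATE))
print(classify_response({"Z", "A", "X"}, NECESSARY, ALTERNATE))
```

Ordering the checks from the most specific category to the most general keeps the intersections (f) and (g) from being absorbed by the broader "vague" and "misconstrued" cases.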



It seems that when the formation-attainment (learning-identification)
continuum is considered in terms of its pedagogical significance, two
fundamental assumptions must be kept in mind.

First, that whenever the learning of a concept occurs in an instructional
setting, it generally is contextually-oriented. For example, it would not be
useful to introduce the mathematical concept of a "derivative" into an
instructional dialogue about the merits or demerits of peaceful coexistence. As
Carroll (4) suggests, it would not even be useful to bring up such a
concept in a mathematical dialogue in which, for example, the learner was
not familiar with the prerequisite concepts of "algebraic function" and
"slope". Hence, concept learning in pedagogy takes place in a definite
context. Therefore in developing a course that requires the communication
of concepts, as would be the case with courses developed from STITE-type
information, the instructor would have to utilize techniques which place
the concept in the proper context.

Secondly, it is assumed that in any pedagogical event where a concept
is to be communicated, the learner's conceptual state of mind immediately
prior to the communication of the concept falls somewhere on the formation-
attainment continuum or on a variation thereof. This assumption implies
that in the communicating of concepts, full regard must be given to
individual differences. That is to say, cognizance must be taken of the
individual's level of "mental" relationship or preparedness to the concept
at hand.

Both of these assumptions are well underlined by Gagne's (5) conditions
of learning model. Gagne has argued and others have shown that the acquisition
of a skill (in this case, one that requires cognitive processing) depends on
and perhaps contains the knowledge and mastery of other prerequisite skills.
He classifies skills in learning into eight types of performance that must
be mastered from the simple to the more complex levels of cognitive activity.
Kemp (8) summarizes the eight levels as follows:

     1. Signal learning
     2. Stimulus-response learning
     3. Chaining
     4. Verbal association
     5. Multiple discrimination
     6. Concept learning
     7. Principle learning
     8. Problem solving

Science teaching or the instructional communication of scientific
concepts is more likely to be associated with the last four levels of
cognitive skill acquisition. Thus, the learner's cognitive preparedness
in terms of the last four categories would have to be taken into account
as one is organizing and sequencing a course or a "science-subject space".

V. VARIATIONS IN USER CHARACTERISTICS IN TERMS OF INSTRUCTIONAL OBJECTIVES

The set of conditions listed above constitutes those that are necessary
to the instructional communication of science information and thus imposes
vital constraints on the instructor's activity in developing a science-
subject space. Because of the claimed universality of these conditions
over science information transfer in learning, it becomes advantageous
to specify them as a set of rules and algorithms with respect to a chosen
topic and then to program the specifications as part of the STITE system.
An example would be the development of a sub-course on the "topic" Eulerian
graphs, as presented in an earlier STITE report (9). In this regard the
system becomes an aid to the instructor in that if, in developing a course,
he should violate one of these universal conditions, the program will
interrupt with a violation note. By the same token, there is a certain
group of conditions, such as learning objectives, which the instructor may
impose on the instructional event when developing a course. To this extent,
course development becomes a collaborative effort between STITE and the
instructor. The final product, the course, will as such be guided by a
combination of necessary and optional conditions. The optional conditions,
those associated with the learning objectives, will be instructor-specific.
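
As a rough illustration of a "violation note", the sketch below checks one necessary condition only: that every concept in a proposed course sequence is introduced after its prerequisite concepts. The rule, the function name, and the data layout are assumptions made for illustration; the report does not specify how STITE encodes its rules. The example data reuse the "derivative" case cited from Carroll (4).

```python
def find_violations(sequence, prerequisites):
    """Return (concept, missing_prerequisites) pairs for a course sequence.

    sequence      -- concepts in the order the instructor plans to teach them
    prerequisites -- mapping from a concept to the concepts it depends on
    """
    introduced = set()
    notes = []
    for concept in sequence:
        missing = [p for p in prerequisites.get(concept, ())
                   if p not in introduced]
        if missing:
            notes.append((concept, missing))
        introduced.add(concept)
    return notes


# Hypothetical course outline in which "derivative" comes too early.
prerequisites = {"derivative": ["algebraic function", "slope"]}
outline = ["slope", "derivative", "algebraic function"]

for concept, missing in find_violations(outline, prerequisites):
    print(f"Violation: '{concept}' is introduced before {missing}")
```

Returning the notes rather than raising an error leaves the decision with the instructor, which fits the collaborative role the text assigns to the system.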

Three categories of objectives have been identified in the literature
(9). These are: (a) the cognitive domain objectives, e.g. naming, solving,
remembering, proving; (b) the affective domain objectives, e.g. enjoying,
appreciating, respecting; and (c) the psychomotor objectives, e.g.
constructing, adjusting, manipulating. It is evident that the instructional
objectives of courses developed with STITE-type information would primarily
belong to the cognitive domain. Under the cognitive domain, there are
several levels at which an instructor may choose to communicate his topic:

(1) He could specify that the communication be at the comprehension
level, in which case terms such as "explaining by defining", "giving examples",
or "summarizing" would apply.

(2) Another level of instructional communication would be that of
application, where terms such as using, solving, and proving would be proper.

(3) The instructor could choose his objectives at the analysis level,
in which case the terminal behaviors would be ones such as inferring and
relating.

(4) The instructor could develop his course with the objective of
achieving synthesis, where organizing and planning may be the desired
objectives.

By allowing for flexibility in the level of objectives, STITE can be
viewed as a collaborative, adaptive system that allows for a wide range of
user's objective-specifying characteristics. Thus it would combine the
necessary conditions that accompany the instructional communication of
STITE-type information with the flexible ones that accompany the wide
variety of user-types.
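
One way an instructor's choice of objective level might be recorded is a simple lookup from terminal behaviors to the four cognitive levels named in this section. The table below only restates the terms given in the text; the names `COGNITIVE_LEVELS` and `level_of` are hypothetical, and nothing here is taken from the STITE programs themselves.

```python
# Terminal behaviors listed in the text for each cognitive level.
COGNITIVE_LEVELS = {
    "comprehension": ("explaining by defining", "giving examples", "summarizing"),
    "application": ("using", "solving", "proving"),
    "analysis": ("inferring", "relating"),
    "synthesis": ("organizing", "planning"),
}


def level_of(behavior):
    """Return the cognitive level a terminal behavior belongs to, or None."""
    for level, behaviors in COGNITIVE_LEVELS.items():
        if behavior in behaviors:
            return level
    return None  # not a cognitive-domain behavior listed in the text


print(level_of("proving"))
print(level_of("enjoying"))
```

A behavior such as "enjoying" returns `None` because the text assigns it to the affective domain rather than to any cognitive level.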






VI. REFERENCES

1. Badre, A. N., and D. U. Starks. "Concept Learning As a Function of
   Ability and Method." Proceedings of the American Psychological
   Association 8:669-670 (1973).

2. Bloom, Benjamin, ed. Taxonomy of Educational Objectives; the
   Classification of Educational Goals. New York, D. McKay,
   1956. 2 vols.

3. Bruner, J. S., J. J. Goodnow, and G. A. Austin. A Study of Thinking.
   New York, Wiley, 1956. 330 pp.

4. Carroll, John B. "Words, Meanings, and Concepts." Harvard Educational
   Review 34:178-202 (1964).

5. Gagne, R. M. The Conditions of Learning. New York, Holt, Rinehart,
   and Winston, 1970. 407 pp.

6. Hunt, E. B. Concept Learning. New York, Wiley, 1962. 286 pp.

7. Johnson, D. M., and R. P. Stratton. "Evaluation of Five Methods of
   Teaching Concepts." Journal of Educational Psychology 57:48-53
   (1966).

8. Kemp, Jerrold E. Instructional Design. Belmont, California, Fearon
   Publishers, 1971. 130 pp.

9. Zunde, Pranas. Scientific and Technical Information Transfer for
   Education (STITE). Atlanta, Georgia, School of Information and
   Computer Science, Georgia Institute of Technology, July, 1973.
   Progress Report, NSF Grant No. GN-36114. 169 pp.



PART TWO

SCIENCE INFORMATION TRANSFER FOR LEARNING *

Albert N. Badre, Dorothy S. Hughes, T. C. Ting, and Pranas Zunde



I. INTRODUCTION

One result of research and development in the field of science
information in recent years has been the establishment of large banks of
descriptive information and bibliographic data that are stored on digital
and analog media. These collections of data, along with the mechanisms for
their organization, search, and dissemination, comprise science and technical
information systems.

In the past these centers have been mainly used by industry and research.
There were indications that the use of such information systems for
educational purposes was very limited. The results of the present study
support this generalization. The main objective of the study was to ascertain
the actual extent of usage of these centers by various categories of science
educators. It is further hoped that this study will help to determine under
what conditions and for what reasons science educators might make more
intensive use of science and technical information centers. The work reported
here is part of an ongoing study on ways and means of enhancing the transfer
of scientific and technical information from its present repositories to
science education systems (3).



* This is the final revised report. An earlier paper, published in
the Proceedings of the 1974 ASIS national conference, gave preliminary
results of the study based on partial, or 22.5%, returns.
The present report is based on a completed total of 33.3% returns of
2000 questionnaires sent.



II. PREVIOUS USER STUDIES

The information explosion has been accompanied by an exponential
growth of studies on information users and their needs. However, in
the field of science, the bulk of studies conducted has dealt mainly with
the information needs of scientists in general and few have compared
information needs between disciplines. Among existing studies there are
relatively few that include findings in the area of science education, and
these usually have been comparisons of information needs based on the
activities of the scientist, i.e. whether he is engaged in research,
teaching, or industry. An example of a comparison study in which some
data about science educators may be found is the early investigation by
Tornudd (7).

Even more unusual are user studies that are directed specifically
and explicitly towards the needs of science educators. Menzel, for
example, studied the information exchanging behavior of a group of
university teaching scientists (6). His conclusions were made with respect
to a generalized concept of the teaching scientist and did not take into
account the relationship between various scientific disciplines and
information acquisition behavior.

Other investigators have taken a broad sub-category of science, such
as social science, physical science, or engineering, and have attempted
to define user needs with respect to these broad categories (1) (2) (5).
Many of these findings have failed to delineate information needs that
are unique to the particular discipline and, furthermore, supporting data
that would allow the extension of findings to other disciplines were not
presented. For example, Bartkus concludes that the engineering educator
is concerned with the needs, among others, of "his research students to
assure coverage of previous work" (1). It cannot be claimed that this
finding is unique to engineering educators but, based on the data
collected, no valid generalizations to other disciplines can be made.

Also directly related to the present research are those studies
concerned with educator requirements in using information systems.
Most of these, of which there were few to begin with, have not been
directly concerned with scientific and technical information centers.
For example, Baughman investigated the information needs relative to
the more general category of educational information centers, while
Borman and Mittman examined the behavior of users of new information
systems, but neither study was information center-specific, nor
discipline-specific (3) (4).

III. GOALS OF THE PRESENT STUDY

While previous studies reveal some useful findings about the
information needs and acquisition practices of science educators,
there are two important factors that have received little emphasis.

(1) None of the above types of user studies has dealt specifically
with the needs of science teachers who are potential or actual
users of science and technical information systems;

(2) All of the above cited studies have dealt with the information
needs of educators in general. There has been no attempt to
investigate those needs, except in a very limited way, by academic
disciplines.

The present study was intended to fill this gap to some extent. In
particular, the goal was to obtain data relative to the following hypotheses:

a) A large percentage of science educators are not aware of
the availability or of the existence of science and technical information
systems.

b) Most science educators who are aware of the availability
of science and technical systems have no ready or easy access to them.

c) Although most academic scientists are not aware of scientific
and technical information systems, those who access such systems do so for
research, and not for teaching purposes.

d) Most of the science educators who have access to such
systems find that the access tools are inflexible and unsatisfactory to
use.

e) Most science educators who access information from science
and technical information centers find the information they receive to be of
little use for their instructional purposes.

IV. METHOD OF INVESTIGATION

Since it was important to access a large segment of science educators
in colleges, it was decided that the most appropriate technique for the
purposes of this study would be the mailed questionnaire. Accompanying
each questionnaire was a letter to the potential respondent explaining
the purposes of the study and the meaning of scientific and technical
information systems. The questionnaire was relatively short; it was
estimated, on the basis of pre-testing, that a potential respondent would
take no more than ten minutes to complete it.

The sequence of questions was designed to be logically ordered. In
organizing the questions, the respondents were divided into two categories:
(a) those who use science and technical information systems, and (b) those
who do not use such systems. This division permitted three categories of
questions based on the type of respondent. The first category of questions
was directed only to those who actually use science and technical information
systems. The second category was directed to those who do not use such
systems. The last category of questions was directed to both types of
respondents, users and non-users.

In total, the questionnaire was made up of eleven questions with
instructions on what to do in order to answer each of the questions.

A stratified random sampling technique was used in selecting
approximately 2,000 science educators from colleges and universities in the
United States as potential respondents. The names of the educational
institutions were obtained from American Universities and Colleges, 1973
edition, and the individual faculty names were obtained from general catalogs
of the colleges and universities selected. More specifically, the sample was
constructed as follows:

1. Within the three major scientific divisions, namely the biological,
the physical, and the social sciences, both the traditional disciplines
and the interdisciplinary programs under each of these broad science
categories were considered. In the biological sciences, the following
fields were selected: biochemistry, botany, genetics, microbiology,
physiology, zoology; in the physical sciences: aeronautical engineering,
astronomy, chemistry, civil engineering, electrical engineering,
mechanical engineering, statistics, physics; in the social sciences:
anthropology, economics, library science, psychology, sociology.



2. Within each field, colleges and universities that offer such a
discipline were identified by listings in American Universities and
Colleges. Approximately 108 institutions were randomly selected from the
list.

3. Between five and ten faculty names were selected from the catalogs
of the selected institutions relative to the chosen department or program.
The selection of individual names was at random. Consequently, the
proportion of faculty ranks can be expected to reflect the actual
composition of the faculty population in the field. A total of
approximately 100 faculty members were selected from each chosen field to
receive the questionnaire.
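The three sampling steps above can be sketched in Python. This is only an illustration under stated assumptions: the catalog layout (field, then institution, then faculty names), the function name, and the toy data are hypothetical and do not come from the study, which worked from printed catalogs.

```python
import random


def draw_stratified_sample(catalog, per_field=100, per_institution=(5, 10), seed=0):
    """Draw faculty names field by field (the strata).

    catalog maps field -> {institution -> [faculty names]}; this layout is
    an assumption made for the sketch.
    """
    rng = random.Random(seed)
    sample = {}
    for field, institutions in catalog.items():
        chosen = []
        order = list(institutions)
        rng.shuffle(order)  # step 2: institutions taken in random order
        for inst in order:
            if len(chosen) >= per_field:
                break
            names = institutions[inst]
            # step 3: five to ten names per institution, capped by what is
            # available and by what is still needed for this field
            k = min(len(names), rng.randint(*per_institution), per_field - len(chosen))
            chosen.extend((inst, name) for name in rng.sample(names, k))
        sample[field] = chosen
    return sample


# Toy catalog (hypothetical): 2 fields, 30 institutions each, 12 faculty each.
catalog = {
    field: {
        f"{field}-college-{i}": [f"{field}-prof-{i}-{j}" for j in range(12)]
        for i in range(30)
    }
    for field in ("botany", "physics")
}
sample = draw_stratified_sample(catalog)
for field, chosen in sample.items():
    print(field, len(chosen))
```

Because names are drawn at random within each institution, no faculty rank is favored, which is the property the text relies on when it expects ranks to appear in their natural proportions.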

V. RESULTS 

It was considered that, given a total of 2,000 questionnaires, a response
rate of 15% to 20% would be sufficient to allow meaningful interpretation
of the data. The sample on which this analysis is based represents a
33.1% response rate.
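As a rough check on these figures, the arithmetic can be written out. The sketch assumes exactly 2,000 questionnaires were mailed (the report says "approximately"), so the count of returns is an estimate and not a number stated in the report.

```python
mailed = 2000  # assumed exact; the report gives this as approximate

# Integer arithmetic avoids floating-point surprises.
returned = mailed * 331 // 1000    # 33.1% response rate
needed_low = mailed * 15 // 100    # 15% threshold
needed_high = mailed * 20 // 100   # 20% threshold

print(f"estimated returns: {returned}")
print(f"returns considered sufficient: {needed_low} to {needed_high}")
```

Under that assumption the number of returns is well above the upper threshold the investigators had set.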

TABLE I    LEVEL OF AWARENESS OF INFORMATION CENTERS

a. PHYSICAL SCIENCES

                               Yes - %    No - %

   Aeronautics                  61.5       38.5
   Astronomy                    45.8       54.2
   Chemistry                    50         50
   Civil Engineering            51.6       48.4
   Electrical Engineering       48.2       51.7
   Geology                      37.5       62.5
   Mechanical Engineering       52.3       47
   Physics                      23.3       76.6
   Statistics                   11.5       88.5

   TOTAL                        42.6       57.4

b. BIOLOGICAL SCIENCES

                               Yes - %    No - %

   Biology                      38.2       61.7
   Botany                       47.6       52.4
   Physiology                   52.1       [illegible]
   Zoology                      38.8       61.2

   TOTAL                        42.4       57.6

c. SOCIAL SCIENCES

                               Yes - %    No - %

   Anthropology                 21.7       78.3
   Economics                     8         92
   Library Science              43.5       56.5
   Psychology                   45.4       54.5
   Sociology                    30.8       69.2

   TOTAL                        27.1       72.9

   OVERALL TOTAL                39.2       60.8

The findings with respect to the first hypothesis, that most science
educators are not aware of scientific and technical information centers, are
outlined in Tables Ia, Ib, and Ic. The results show that in all three
categories, the number of science educators who are not aware of such centers
is greater than the number who are. However, the level of awareness varies
over the three major science divisions. It can be seen that the physical
sciences have a much higher level of awareness than the social sciences.
Chemists, astronomers, civil engineers, and physiologists seem to be among
those who are most aware. These findings, combined with those of Table II,
seem to confirm the assertion that science and technical information centers
are publicized mainly as services for the research needs of the physical and
biological sciences. Psychology is the only discipline in the social sciences
that seems to have a high level of awareness. This may be due to the fact
that psychology already has an existing specialized information service,
e.g. Psychological Abstracts, while the other social sciences are not so well
covered. In addition, psychology has less defined boundaries, thus
allowing its scientists more contact with the other scientific divisions
than is true of the other social sciences.



TABLE II    RESEARCH VS. TEACHING USAGE OF INFORMATION CENTERS

a. PHYSICAL SCIENCES

                               Research %   Teaching %

   Aeronautics                  85.7         14.3
   Astronomy                    88.8         11.1
   Chemistry                    68.3         31.7
   Civil Engineering            66.6         33.3
   Electrical Engineering       92.9          7.1
   Geology                      75           25
   Mechanical Engineering       80           20
   Physics                     100            0
   Statistics                  100            0

   TOTAL                        77           23

b. BIOLOGICAL SCIENCES

                               Research %   Teaching %

   Biology                      75.9         24.1
   Botany                       83.5         16.7
   Physiology                   78.5         21.4
   Zoology                      81.3         18.7

   TOTAL                        79.2         20.8

c. SOCIAL SCIENCES

                               Research %   Teaching %

   Anthropology                 71.4         28.6
   Economics                   100            0
   Library Science              63.6         36.3
   Psychology                   70           30
   Sociology                    77.7         22.2

   TOTAL                        71           29

   OVERALL TOTAL                76.8         23.2



In order to determine the relationship between awareness and use, the
data in Table III was collected. It indicates the relationship between
those who are aware of information center services and those who actually
use those services. It seems obvious from the results that if the
scientist is aware of information centers, there is a high likelihood that
he will use their services.

TABLE III    RELATIONSHIP BETWEEN AWARENESS AND USE OVER THE
             THREE MAJOR DIVISIONS

                               Yes - %    No - %

   Physical Sciences            78.3       22.7
   Biological Sciences          82.2       17.8
   Social Sciences              70         30

   TOTAL                        76.9       23.1

The second hypothesis, that ready and easy access to scientific and
technical information centers is not available, was tested by eliciting
information as to whether scientists use those services directly or through
other channels. As Tables IVa, IVb, IVc, and IVd show, approximately 50%
accessed the services through indirect channels such as libraries.

It was hypothesized that most academic scientists who access information
centers use such services for research and not for teaching. The
results shown in Tables IIa, IIb, IIc, and IId indicate that the primary
use of information received from centers is for research. This was an
expected finding since the science and technical information centers are
designed to serve the information needs of research scientists. There were
no noticeable differences between the three major divisions of science nor
between individual disciplines.






TABLE IV    DIRECT AND INDIRECT ACCESS TO INFORMATION CENTERS

a. PHYSICAL SCIENCES

                               Direct - %   Other Channels - %

   Aeronautics                  42.8         57.1
   Astronomy                    53.8         46.1
   Chemistry                    51.9         48.1
   Civil Engineering            62.5         37.5
   Electrical Engineering       42.8         57.1
   Geology                       0          100
   Mechanical Engineering       44.4         55.5
   Physics                      37.5         62.5
   Statistics                   50           50

   TOTAL                        51           49

b. BIOLOGICAL SCIENCES

                               Direct - %   Other Channels - %

   Biology                      57.6         42.3
   Botany                       41.2         58.8
   Physiology                   54.5         45.4
   Zoology                      50           50

   TOTAL                        52.9         47.1

c. SOCIAL SCIENCES

                               Direct - %   Other Channels - %

   Anthropology                 33.3         66.6
   Economics                     0          100
   Library Science              71.4         28.5
   Psychology                   62.5         37.5
   Sociology                    57.1         42.8

   TOTAL                        55.2         44.8

d. OVERALL TOTAL                51.5         48.5



Tables V and VI reflect the findings with respect to the fourth
and fifth hypotheses concerning the levels of satisfaction with the
services of information centers and the relationship between the
information requested and the information received. It can be seen that
those who are regularly or often satisfied with the services of information
centers constitute more than half of the users represented. However,
it must be noted here that, with respect to the fifth hypothesis, the
findings reflected in Table VI must be considered in relation to the
findings in Table II. Since approximately 80% of all users accessed
centers for research purposes, it becomes clear that the high rate of
correspondence between material requested and that obtained applies
primarily to research, and less to the instructional needs of the
scientist. From this finding, it appears that the science educator,
regardless of his subject field, goes elsewhere for information for
his instructional purposes.

TABLE V    LEVEL OF SATISFACTION WITH INFORMATION CENTERS

                          Never-%   Sometimes-%   Often-%   Always-%

   Physical Science         1         24.2         26.4      49.3
   Biological Science       1.6       15.9         25.4      57.1
   Social Science           3.5        7           24.1      65.5

   TOTAL                    1.6       18           24.7      55.7

TABLE VI    CORRESPONDENCE BETWEEN WHAT IS REQUESTED AND
            WHAT IS OBTAINED

                          Never-%   Sometimes-%   Often-%   Always-%

   Physical Science         3.5       22.1         34.9      39.5
   Biological Science       3.3       11.5         36.1      50.1
   Social Science           0         12           36        52

   TOTAL                    2.9       16.8         35.5      44.8



VI. CONCLUSIONS

The results of the study support the two statements that the primary
use of information centers is for research and that a majority of science
educators are not aware of the services of information centers. However,
most of the science educators who are aware of the services use the centers 
frequently and with satisfaction. It may be concluded that exposure of 
information services to science educators is an important factor in the 
improvement of the use of information centers. 






VII. REFERENCES 



1. Bartkus, E. P. "Major Sources of Information for Engineering
      Educators." Engineering Education, 60:377-380 (January, 1970).

2. Bath University of Technology. Information Requirements of Researchers
      in the Social Sciences. Bath, England, Bath University of
      Technology, 1971. 2 vols. ED054806 and ED054807.

3. Baughman, Robyn C. Survey of Information Needs of Educational
      Information Specialists. College Park, Maryland, University
      of Maryland, School of Library and Information Services, 1972.
      31 pp. ED068101.

4. Borman, Lorraine; Mittman, Benjamin. "Interactive Search of
      Bibliographic Data Bases in an Academic Environment." Journal of
      the American Society for Information Science, 2:164-171 (May -
      June, 1972).

5. Brittain, J. M. Information and Its Users; A Review With Special
      References to the Social Sciences. New York, John Wiley, 1971.
      208 pp.

6. Menzel, Herbert. "Planned and Unplanned Scientific Communication."
      In Proceedings of the International Conference on Scientific
      Information, Washington, D.C., November 16 - 21, 1958.
      Washington, D.C., National Academy of Science - National Research
      Council, 1959. Vol. I, p. 199-243.

7. Törnudd, Elin. "Study of the Use of Scientific Literature and
      Reference Services by Scandinavian Scientists and Engineers
      Engaged in Research and Development." In Proceedings of the
      International Conference on Scientific Information, Washington,
      D.C., November 16 - 21, 1958. Washington, D.C., National
      Academy of Science - National Research Council, 1959. Vol. I,
      p. 9-65.

8. Zunde, Pranas. Scientific and Technical Information Transfer for
      Education (STITE). Atlanta, Georgia, School of Information and
      Computer Science, Georgia Institute of Technology, June, 1973.
      Progress Report, NSF Grant No. GN-36114.

9. Zunde, Pranas. Scientific and Technical Information Transfer for
      Education (STITE). Atlanta, Georgia, School of Information and
      Computer Science, Georgia Institute of Technology, December, 1973.
      Progress Report, NSF Grant No. GN-36114.

10. Zunde, Pranas. Scientific and Technical Information Transfer for
      Education (STITE). Atlanta, Georgia, School of Information and
      Computer Science, Georgia Institute of Technology, July, 1974.
      Progress Report, NSF Grant No. GN-36114.



PART THREE

ANALYSIS OF THE EDUCATIONAL USE OF AN INFORMATION CENTER

Dorothy Hughes

I. INTRODUCTION

Results of the survey of the educational needs of science educators
seemed to indicate that further investigation into the instructional usage
of a specific information center might yield some helpful results, and a
group of users of the Georgia Information Dissemination Center was selected
by STITE for this inquiry.

The Georgia Information Dissemination Center, or GIDC, is a
bibliographic retrieval system serving the faculty, research staff, and
graduate students of the University System of Georgia. It began in 1968
and searches multiple data bases to provide both SDI and retrospective
search services to its users. Physical facilities are located on the
university campus at Athens, and remote users from Georgia's eleven other
senior colleges and twelve junior colleges are usually served through
reference librarians on those campuses.

A list of users of the GIDC was obtained from the Center. This
user list contains approximately 4,000 names, along with the academic
department with which the user is associated and his mailing address.

As a beginning, telephone calls were made to sixty-seven randomly
selected users at the Georgia Institute of Technology and at Georgia State
University to inquire if any of the information they had received from GIDC
had been used for instructional purposes. When nineteen of the
professors contacted indicated that such had been the case, personal
interviews were arranged with nine of them in five different departments
of the two schools to explore at greater length the instructional usage
that these professors had made of the information they had received.
In addition, five of the telephone calls developed into interviews when
the initial call aroused interest and the conversation was subsequently
extended into greater depth, so a total of fourteen personal interviews
were conducted.

Each professor was asked in the interview how he became aware of 
the GIDC, the actual educational task that was accomplished with the 
information he received, if the information was directly usable as it was 
received in the print-out, and the features of an information system that 
would make it a better one for instructional purposes. 

Awareness of the GIDC, as it was indicated in the interviews, came
through formal and informal channels, with the informal ones seeming to
be the stronger. In one case a memo and a brochure explaining the service
were sent to the professor from the campus library, and in another a
presentation about the service was made to the department by an information
specialist from the GIDC. Two professors indicated that the deans of
their departments had explained the service to them while two others 
indicated that they learned of it through colleagues. A student informed 
one professor that the GIDC was available, and another overheard a 
conversation about it between the dean of the department and a student. 
Finally, at least one professor just could not remember how he learned 
about it. 

Actual educational tasks that utilized the information received from
GIDC included course preparation, course up-date, preparation and up-date
of reading lists and bibliographies, development of notes for a course
for which no text existed, assistance in writing text books, and term
paper assignments.

In only three instances was the information used directly from the
print-out. One professor divides the print-out and distributes portions
of it to his graduate students with instructions for them to locate, read,
and annotate articles. This practice teaches the literature of the field
and results in an annotated bibliography. A chemistry professor binds
the print-outs in a looseleaf notebook and makes them available to students
who are working in the area of the search. Another chemistry professor
selects five or six references from the print-out to make subject lists
which he then gives to students with instructions to locate and read the
articles and then develop a term paper.

In most cases, however, the references from the print-out are
retrieved and scanned or read before the information is utilized in any
way.

Most of the suggestions for modifications in the GIDC to increase
or improve its use by educators were concerned with practical aspects of
its use rather than with the nature of the information itself. While
several professors expressed the desire for the inclusion of abstracts at
no charge and for a capability to retrieve examples and problems, most of
them stressed simplicity and ease of use through more convenient methods
of instigating searches, more accessible assistance in profile preparation
and revision, and changes in the print-out which would increase its
readability and ease of handling.






III. THE QUESTIONNAIRE TO GIDC USERS 

Following the personal interviews with GIDC users, a further inquiry,
in the form of a questionnaire, was sent to a larger sample of users.

This questionnaire (See Appendix B) was designed to determine what,
if any, instructional usage the recipient makes of the information he
receives from GIDC and his suggestions for change in the system that would
be beneficial to the educator community.

Testing of the questionnaire was done with six professors on the
Georgia Tech campus. After some discussion with those recipients and with
Ms. Margaret K. Park of the GIDC, some revisions were made, and copies of
the questionnaire in its final form were mailed with an explanatory letter
and a return envelope to 1211 persons on the user list. (See Appendix A
and Appendix B.)

Of the 1211 sent, 410, or almost 33%, were answered and returned.
Not every respondent answered every question posed, but the
following summary is based on the answers received. For example, many
users did not indicate the subject area in which the information was used
[illegible] have been made from the answers received, and some
reservations about general assumptions must be kept in mind.

Of the 410 users who responded, 251, or 61.2%, are apparently actively
engaged in instructional activities as instructors, assistant professors,
associate professors, and professors. (The category of "other academic"
includes librarians, teaching assistants, research associates, directors,
and information scientists, while the non-academic category takes in
administrators, project directors, program coordinators, managers, and those
engaged in research only.) One hundred nineteen users were graduate students
who may or may not be engaged in teaching.






One hundred eighty-eight persons, or 45.8%, indicated that the
material they received from GIDC had been used for some type of instructional
purpose, and Question 3 brought out the fact that the most common
instructional use is the compilation of bibliographies or reading lists.
This type of use was indicated by 137 users, or 33.4%. Utilization for
current awareness in the subject area of a course was indicated by 98
respondents (23.9%), while 80 (19.5%) said that they used the information
for collection of data. Updating an existing course was a utilization cited
by 62 (15.1%), while the tasks of preparation of illustrative examples and
development of new courses were indicated by 42 (10.2%) and 34 (8.2%) users
respectively. The "other" category checked by 34 persons most often referred
to directing research projects of students and developing research papers.
Selection of case studies and preparation of quizzes and tests were less
frequent uses.
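The percentages in this paragraph can be recomputed from the counts. The sketch below assumes each percentage is taken over the 410 returned questionnaires; under that assumption the printed figures match when the value is cut to one decimal place without rounding up. The dictionary labels and the function name are illustrative only.

```python
RESPONDENTS = 410  # questionnaires answered and returned

# (count, percentage as printed in the report)
reported = {
    "bibliographies or reading lists": (137, 33.4),
    "current awareness": (98, 23.9),
    "collection of data": (80, 19.5),
    "updating an existing course": (62, 15.1),
    "illustrative examples": (42, 10.2),
    "development of new courses": (34, 8.2),
}


def percent_truncated(count, total):
    """Percentage of total, cut (not rounded) to one decimal place."""
    return (count * 1000 // total) / 10


for use, (count, printed) in reported.items():
    computed = percent_truncated(count, RESPONDENTS)
    print(f"{use:32s} {count:4d}  {computed:5.1f}  (report: {printed})")
```

Since a respondent could check more than one use, the counts are not expected to add up to the 188 persons who reported instructional usage.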

The questionnaire was structured in such a way that actual instructional
usage was followed by indication of the subject field of utilization. (See
the Questionnaire in Appendix B.) Consequently, as regards GIDC, the subject
areas that are mentioned most often would also be the subject areas that
receive the most instructional usage. Such a conclusion, however, must
be tenuous in the light of other factors that also affect usage. For
example, the subject area of education was the largest area indicated, with
53 persons, or 28.1% of those indicating instructional usage of the
information they received, coming from that field. The area of education
is supported by a large, well-developed, and apparently well-known data base,
that of ERIC. (Indeed, a number of respondents recognize the GIDC system
only in terms of ERIC. As one said, "I do not know what the GIDC is, but
I have used ERIC", and another said, "The only service I have ever used
is ERIC".) So the extensive usage in the area of education may simply
reflect the existence of an adequate data base, rather than any particular
characteristic of the field itself which might be unusually appropriate
for computer searches.

Also, the departments of education within the university system
which the GIDC serves generally have large enrollments and consequently
more faculty. So the large number of users in a particular subject field
might simply reflect an aspect of the user population, such as size, rather
than an aspect of the subject field itself.

After education, biology was the second largest subject mentioned
(by 23 or about 12%), and chemistry was next (by 20 persons or 10.6%). Here
again this usage could reflect somewhat the existence of adequate data bases.
However, psychology, which was indicated by 18 users, or 9.5%, as their
utilization area, and which was the fourth-highest subject area, was
named under Question 8 by at least four users who expressed the desire for
better coverage in the area of psychology.

Geology, social science, physics, agronomy, pharmacology, biochemistry,
veterinary medicine, entomology, and microbiology were all mentioned between
five and eight times each as being the subject of the information utilized,
while a large variety of other fields, such as economics, foreign language,
agriculture, and information sciences, were mentioned once or twice.

The questionnaire suggested four different types of courses in which
the user might have used the information he received from GIDC: lecture,
seminar, special project, and laboratory. The special project seemed to
be most adaptable to the GIDC information, for it was checked by 52.6% of
those who indicated instructional usage. After that came the lecture, with
40.9%, and then the seminar, with 35.1%. Laboratory usage was indicated
by about 24% of the users. The category of "other" covered such methods as
workshops, individualized instruction, thesis research, discussion groups,
graduate research, and research papers.

Most of the educational users (136 or 72.2%) indicated that, for
their purposes, it was necessary for them to obtain full-text documents,
rather than to rely on titles and abstracts, although 81 did say that titles
and abstracts were sufficient. In a few cases there seemed to be some
confusion about the availability of abstracts and methods of securing them.

The last two questions, 7 and 8, dealt with suggested improvements
in the GIDC service that might increase its instructional utilization. The
original purpose of the questionnaire, to determine the amount and kind of
usage for instructional purposes that the Center is receiving, must be kept
in mind in looking at the responses to these queries. Particularly in
relationship to suggestions for improvement, important consideration was not
given to the possibility, advisability, practicality, or likelihood of
implementation; the effort was merely to ascertain what changes the user
felt would enhance the utilization of the system for instructional purposes.

Question 7, a short list of changes or improvements, was answered
in the following numbers:

   More descriptive abstracts                  39.2% or 161 users
   Browsing capability                         29.5% or 121 users
   Interactive system for query or
      profile formulation                      27.3% or 112 users
   Easier access to the system                 22.4% or 93 users
   Shorter waiting time for information
      delivery                                 22.4% or 92 users

Question 8 asked for suggestions from the reader for the improvement
of service, and it is significant that a large measure of satisfaction
with the service was indicated by a number of users. Favorable comments,
such as "excellent service", "time saver", "most useful" and "very helpful",
appeared on some of the questionnaires, and while there were a few negative
comments too, the inclusion of suggestions for improvement should not
necessarily be interpreted as unfavorable criticism.

The suggestions themselves fall into two general areas, those having
to do with data bases and coverages of the service and those concerned with
the functional details of the operation.

The most repeated suggestion involving data bases was for greater
coverage. Twenty persons, or 4.8%, desired expansion of data bases, and
the areas of the social sciences, the humanities, and psychology were each
mentioned several times as being areas in which coverage should be expanded.

The matter of language was a concern for at least five respondents
who requested that the language in which the article appears be specified
in the entry and that the language of articles be specified in such a way
that items can be chosen or rejected on the basis of particular languages.
Another user asked that less from Russian journals be included.

Eight persons requested that more complete information, including
addresses of authors, be given to facilitate ordering reprints.

At least five persons suggested more abstracts, although two suggested
the use of summaries or annotations instead of abstracts. Another would
like to access abstracts of dissertations.

Obtaining materials once they have been cited is a problem for some,
and suggested solutions included adding directions for securing articles
to the citation, some indication of the availability of the article in
local libraries, or including the Library of Congress call number with the
citation.




Books, as opposed to journal articles, are sometimes not included
in certain data bases, and at least two users would like to have books,
their contents, and their prices as a part of the coverage. Another would
like to see Books in Print available for computer searching.

One user would like some sample profiles with print-outs to use in
instruction. Another wondered if permanent files of subjects frequently
requested for retrospective searches might be a time and money saver. A
two-year record of on-going research in various fields would be an asset
for still another user, and the great amounts of paper used, with the
expenditure in time and trees, were a cause of at least one complaint.

The suggestion most often made in the area of operational details
of the GIDC was that the system should have more and better publicity to
create a greater awareness of the system. At least 25 persons cited this
need. Perhaps related to this suggestion is the fact that 10 respondents
stated that they are unfamiliar with GIDC ("I don't know what you are
talking about", "I never heard of GIDC"), although names of all recipients
of the questionnaire were taken from the GIDC users list.

The second most frequently made suggestion (from 8 persons) was for
more education or training for the user in order to give him some
understanding of profile preparation, the strengths and limitations of
computer searches, and his own responsibility in modifying and adjusting
profiles to improve their effectiveness.

The mechanics of profile preparation was the object of a number of
suggestions. Eight persons mentioned the need for more specificity in the
profile. The ability to interact with the system and to do "trial and
error" searches were mentioned as making for possible improvement in
profile preparation. One professor would like some sample profiles set
up to run just for teaching, and another would like to be able to run
a quick, one-time profile for rapid turn-around. An annual request
from the Center for profile changes and adjustments was also suggested.

At least 3 users complained of the excessive time required to file 
a profile and to begin receiving some search results. A suggestion for 
improvement in that area was to print out search results locally (i.e., on 
each campus) rather than at a central location as is presently done. 

Five persons felt that easier and faster access to personnel that
provide service for the Center would be an improvement. Simply getting
an appointment with the person who assists in profile preparation was a
problem for two users, and one user was frustrated because he felt that
the person assisting him was not sufficiently knowledgeable in his subject
to give him the assistance he needed. Relating the service more closely
to the library and to the manual searching there was another suggestion,
and one graduate student felt that one method of improvement would be to
permanently assign specific subject areas to specific librarians for
profile preparation and for retrieval assistance as well.

Locating information after it has been cited created problems for
some, and the circulation of an annual list of journals searched with an
indication of their availability in the local library could be a partial
solution.

Other suggestions that were mentioned included the desire for a
consistent search schedule to produce profiles on a regular basis rather
than erratically, the need of some method to request particular abstracts
after the first bibliographic profile has been scanned for relevancy, and
some way of reducing the amount of paper used.






In conclusion, an assessment of what was learned from the questionnaire
in terms of its intent should be made. It seems clear that some
educators are using the GIDC for some instructional purposes, primarily
those tasks that utilize bibliographic information. While the suggestions
that these educators made for improvements in the service of GIDC were
numerous and varied in scope, their implementation would not seem to produce
any appreciable increase in the instructional usage of the Center.




PART FOUR

DESIGN OF STITE DATA BASE MANAGEMENT SYSTEM

L. J. Gallaher and Pranas Zunde

I. INTRODUCTION

The major functions of the STITE system were broken down into tasks 
and discussed in the second STITE Progress Report (3). Seventeen major 
tasks were identified and analyzed. All of these center around the use of 
the "internal" information, structured as modules and described in the third 
STITE Progress Report (4) . 

The main function of such modular "internal" information is to enhance
systems applications, particularly in retrieval of "external" information.
This information, stored in various scientific and technical information
centers, consists normally of standard bibliographic data, keywords used to
index documents, and in some cases, abstracts. Henceforth this information
will simply be called "external data".

This report will describe programs which are being developed for STITE
data base management, including both internally and externally retrieved
information. In particular, the physical and logical structure of the files,
together with a command language for manipulating the data structures in the
files, will be discussed.

II. DATA BASE ELEMENTS

Data base elements are records of two kinds: modules of internal
information and units of external information, such as bibliographic records.

The modules have been described in detail earlier. Each consists of
textual material and additional information about the textual material.
This information associated with the text is arranged into fields, with the
principal topic being designated as A field. B field contains a set of terms
which are used to describe or to explicate the principal topic, or A field.
C field indicates the source document from which the textual material is
taken, F field specifies the form of the text (e.g., English language, chart,
etc.), H field indicates type of information (e.g., definition, theorem, or
problem), and D field refers to the level of difficulty of the textual material.
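
As a rough illustration, the six fields just listed could be gathered into a
C record like the one below. This is only a sketch: the type and function
names (stite_module, module_has_term, example_module_629), the bound
MAX_B_TERMS, and the use of small integer codes for the F, H, and D fields
are assumptions made here for illustration, not the storage layout of the
STITE programs. The sample values are those shown for module 629 in Figure 6,
with the H value read as 1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_B_TERMS 16   /* assumed upper bound, for illustration only */

/* Hypothetical in-memory view of one module and its fields. */
struct stite_module {
    int         number;               /* module number, e.g. 629        */
    const char *text;                 /* free text of the module        */
    const char *a_topic;              /* A field: principal topic       */
    const char *b_terms[MAX_B_TERMS]; /* B fields: terms explicating A  */
    size_t      b_count;              /* number of B terms in use       */
    const char *c_source;             /* C field: source document       */
    int         f_form;               /* F field: form of the text      */
    int         h_type;               /* H field: type of information   */
    int         d_level;              /* D field: level of difficulty   */
};

/* Return 1 if `term` is one of the module's B terms, 0 otherwise. */
int module_has_term(const struct stite_module *m, const char *term)
{
    assert(m != NULL && term != NULL);
    for (size_t i = 0; i < m->b_count; i++)
        if (strcmp(m->b_terms[i], term) == 0)
            return 1;
    return 0;
}

/* Field values of module 629 as listed in Figure 6 (free text omitted). */
struct stite_module example_module_629(void)
{
    struct stite_module m = {
        .number   = 629,
        .text     = "",
        .a_topic  = "edge-sequence",
        .b_terms  = { "finite sequence", "edge", "form",
                      "consecutive edges", "adjacent edges",
                      "identical edges", "arbitrary sequence" },
        .b_count  = 7,
        .c_source = "WIL/1/128",
        .f_form   = 1,
        .h_type   = 1,
        .d_level  = 1
    };
    return m;
}
```

A caller could, for instance, test whether a module explicates its topic with
a given term by calling module_has_term(&m, "edge").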

Modules can be ordered or linked to each other by various relationships 
defined on these fields, and structures can be built with the modules or their 
components. It is these structures or relationships that are of prime interest 
and that will be used to aid in selecting the bibliographic information. 

External data would normally be supplied by an on-line link to various
data bases, such as the Dialog (Lockheed) system or the Georgia Information
Dissemination Center. However, for demonstration purposes, these external
sources will be simulated by appropriate material that is in fact internal to
the system, since such an on-line link is not economically feasible within
the financial resources of the STITE project.

External data will be processed by the STITE system in two steps, or
levels. At the first level, external data will be retrieved by queries
submitted to the external source in the standard format expected by that
source. At the second level, information retrieved at the first level will
be processed for the purpose of (1) achieving compatibility with the records
of the internal system, (2) extracting additional information, and (3) updating
lists and file structures.

At the first level, the main task is to present an intelligent request
to external sources using the internal information stored in the modules,
and at the second level, the main task is to cast the records retrieved
from external sources into a format approaching as closely as possible the
format of complete modules.

III. GENERAL DESCRIPTION OF THE STITE INTERFACE SYSTEM

Figures 1 through 5 are diagrams displaying main features of the
overall STITE system. These diagrams represent the system from the point
of view of the user and emphasize the interaction with the user.

Information about the system, its capabilities, and the type and extent 
of its information stores is incorporated into the system and will be supplied 
to the user by the system in a dialogue form. 

Explanations to the user about the system are of two kinds. The first
kind is a general outline of what the system is supposed to be able to do for
the user and how to go about using it. This is the "user's manual" and will
give an explanation of the details for using the system. It will be structured
hierarchically so that the user may skip through it to the points of interest
to him and is similar to the standard introductory material needed for any
on-line system.

In addition to the general outline on the use of the system, there will
be a region of more specific information about the subject matter contained
in the modules. Here the user will be interrogated about his specific interests
as regards subject matter and task, and an effort will be made to inform him
of the content of the modules and how he can or cannot be helped by the system.
The emphasis here is on subject matter and on what subjects are covered by
the modules and what are not.

To carry out certain tasks, especially the open-ended ones, a command
language will be provided to the user after he arrives at that task. This is
not a conversational language but a set of commands that allows him to
manipulate, build, and rearrange file structures to suit his needs. This
command language itself is quite simple and will be interpreted by the system.

Fig. 1. Flowchart boxes, in order:

   1) Give brief outline of STITE system, as requested.
   2) Elaborate on subject information as requested, refining detail.
   3) Determine subject matter and task.
   4) Branch to specific task.

Fig. 2. Flowchart boxes for Task 5, in order:

   1) Reaffirm subject (topic); refine to subtopic as desired.
   2) Retrieve BIBL entries; sort by degree of correlation.
   3) Give count of BIBL entries retrieved; give degree of correlation
      histogram if requested.
   4) Exhibit BIBL entries in order of importance, one at a time
      until halt requested.
   5) Save or print BIBL entries as requested.
   6) Exit or go back to start of Task 5 as requested.
   7) Exit Task 5.

Fig. 3. Flowchart boxes for Task 9, in order:

   1) Enter Task 9.
   2) Reaffirm subject (topic); refine to subtopic as desired;
      establish specific period (6 mo., 1 yr., 2 yr., etc.).
   3) Retrieve and give count; give degree of correlation
      histogram if requested.
   4) Sort by degree of correlation or date as requested.
   5) Exhibit BIBL entries as requested, one at a time
      until halt requested.
   6) Save or print BIBL entries as requested.
   7) Exit or go back to start of Task 9 as requested.

Fig. 4. Flowchart boxes for Task 14, in order:

   1) Enter Task 14.
   2) Reaffirm task, get specific abstract identification.
   3) Retrieve and give count; give degree of correlation
      histogram if requested.
   4) Sort by degree of correlation; exhibit BIBL entries as requested,
      one at a time until halt requested.
   5) Save or print BIBL entries as requested.
   6) Exit or go back to start of Task 14 as requested.
   7) Exit Task 14.

Fig. 5. Flowchart boxes for Task 15, in order:

   1) Enter Task 15.
   2) Reaffirm task: get concept from user.
   3) Exhibit modules explaining concept, one at a time till end
      or user requests halt.
   4) Save or print as requested.
   5) Exit or go back to start of Task 15 as requested.

At this stage, the following tasks have been considered from the list
of potential tasks described previously in a STITE Progress Report (3):

Task No. 5  - Compilation of bibliographies on selected topics.

Task No. 9  - Compilation of references for a state-of-the-art
              review or of abstracts covering a specified
              time period.

Task No. 14 - Retrieval of all abstracts related to a specific
              abstract.

Task No. 15 - Presentation of concept definitions and
              explications.

IV. THE PHYSICAL AND LOGICAL STRUCTURE OF THE FILES

Previous reports have outlined in a general way the organization of 
the automated STITE system. Details of the structure of the information files, 
which will make possible the kinds of action desired, will now be given. 

There are two distinct kinds of files: records of internal information,
or modules, and bibliographic data records (BDR) embodying external information.

A bibliographic data record indicates the author and title of the
document; title, date, and page number for a journal article; report number
and sponsoring agency for a technical report; eventually an abstract of the
document; a list of keywords which have been used to index the document; and
the index or abstract journal, with its date and abstract number, in which the
document appears.

The information content of each module has been previously described
in another STITE report (4). Figure 6 below illustrates typical module
content.

Module 629

   Given any graph G, an edge-sequence in G is a finite
   sequence of edges of the form

      [formula illegible in this copy; the subscripts 1, m-1, and m survive]

   (also denoted by [illegible]). It is clear that
   an edge-sequence has the property that any two consecutive
   edges are either adjacent or identical; however, an arbitrary
   sequence of edges of G which has this property is not necessarily
   an edge-sequence.

   A   edge-sequence

   B   finite sequence
       edge
       form
       consecutive edges
       adjacent edges
       identical edges
       arbitrary sequence

   C   WIL/1/128
   F   1
   H   1
   D   1

Fig. 6. Modules 530 and 629 (only module 629 survives in this copy)



The fundamental unit of a file is the record which, if not specified,
refers to either a module or a BDR. The records of both the module and BDR
files are organized in a similar manner and so will be discussed together.

A record is considered to have a name and an association list. The
record name will be an alphanumeric string consisting of a number prefixed
by BDR if it is a bibliographic data record or by MOD if it is a module.
The association list will consist of a collection of pointers designating
the elements (properties) or fields of the records.

Each record consists of a collection of fields. Each field has a name,
value, and association list. The fields are the sub-elements of the records
and differ from the records in having values in addition to names. The name
and value of a field are joined in a single alphabetic string with the
value set off by curly brackets. Field names consist of a one-,
two-, or three-letter string. For example, the name value for the A field
of module 629 of Figure 6 would be A {edge-sequence}, and the B field name
values would be B {finite sequence}, B {edge}, B {form}, etc.

Most of the records also have free text entries, and these are
referred to as the free text fields and are treated slightly differently.

Each field value (except the text) also has an association list.
Association lists of the field values can have such things as the list of all
records containing that particular field value (the inversion) or any other
information deemed relevant or useful. Initially most of these lists are
null, but the capability of an association list for a field will always
be present except for the free text field. The free text fields do not
have association lists.

The concept of field is quite general so that it is possible to
assign new fields to records as needed. Also, in discussing the structure
of the files, it will be seen that the distinction between file, record,
and field is rather artificial and that all "logical" units are treated
in a unified manner.

Physically the files are divided into two kinds: alphabetic
information (called y-files) and pointer information. The alphabetic
information is stored as alphanumeric strings with distinct units separated
by a special character (low value). The pointer information is stored as
a collection of binary (machine language) integers.

Alphabetic files contain all the names, values, and free text 
information. The pointer information shows the form of the association 
lists and indicates such things as the structure of a record or file. 

The pointer information is of two types also: hash references 
and association lists. These are not separated since the structure of 
the two is very similar, as well as the frequency and mode of reference. 

Figure 7 gives a schematic representation of the file structure and 
linkages for a part of record Module 629. Details on the hash file and 
the structure of the association lists will be given later in this chapter. 

Figure 8 represents another view of the file structure. Here the
files are considered in terms of ownership and are to be made up of permanent
and temporary working areas. From this point of view, the file owners are
the system (the system manager) and the users. The system manager can be
considered as a special user who owns the STITE system files. In this
picture the BDR files are external to the system and represent the information
available from the outside and are "owned" by their supplier.

The users will have both temporary and permanent files. They will
also have the capacity of moving one file into the other but with this
restriction: A user may not move a file into a file not owned by himself.
Thus, only the system manager can modify the permanent system files.

[Fig. 7. Schematic representation of the file structure and linkages for
part of Module 629. Diagram not legible in this copy.]

[Fig. 8. File structure in terms of ownership. Diagram not legible in this
copy; one surviving label reads "Permanent Systems Files".]

The main characteristic of the temporary files is that they disappear
when the user disconnects, while the information in the permanent files is
kept and is available again at the next use. This ability to create structures
in temporary files and then to move them to permanent areas is an important
capability and one that will allow the effectiveness of the system or of a
user to grow with time.

The distinction between the terms "physical" structure and "logical"
structure in a file system such as this is important and needs clarification.
This is particularly true when one recognizes that the files are handled
through an operating system which in turn makes its own distinctions. The
term "physical file" is used to mean a named file, viewed as a distinct
entity by the operating system. The actual physical device assigned by
the operating system is of no concern. However, in all cases, it is
expected that the physical device is a disk and that the physical records
are randomly accessible.

A "logical file" means those elements (i.e., free text, names,
values, and pointers showing structure) that logically belong together
whether they reside in the same physical file or not. From this point of
view there are two kinds of logical files in the system, BDR files and module
files, with further logical distinction associated with the owner and whether
the file is permanent or temporary.

In this way it can be seen that a single logical file (or record) may
be distributed over several physical files while at the same time a single
physical file can contain components of several logical files. From the
overall view of the operation of the STITE system, the logical file is
the important concept. However, in dealing with the details of how the
system is maintaining the information, the form of the physical files
is important.

V. THE COMMAND LANGUAGE

For carrying out certain tasks and for performing other desired
operations, a command language for manipulating the records and creating
structures has been implemented.

Looking at the records or structures being manipulated, one notes that
they are basically names and sets of names. Associated with each name is a
list called its association list. An association list is just a set of names
(since the order is not relevant and no duplicates will be allowed). Some
of the association lists are empty. Set theoretical operations are defined
and implemented on these association lists, i.e., on these sets of names.

The commands are in a very simple form and restricted to unary and
binary operations. One of the operands is designated by a P or pointing
register; the other, if there is one, is named in the command. (This is
reminiscent of a single address machine language with an "accumulator"
register, except that the "accumulator" register does not hold the item
to be operated on but only points to it.) The item pointed to by the P or
pointing register will sometimes be referred to as just P. Thus, the
command point X sets P to X, that is, sets the P or pointer register
pointing to X. The command attach Y adds Y to the P list, that is, adds
the name Y to the association list of the name pointed to by the P or
pointer register.
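
The role of the P register can be made concrete with a small C sketch of
the create, point, attach, and is member commands. Everything here is an
assumption made for illustration: the function names (cmd_create and so on),
the fixed table sizes, the return codes, and the representation of an
association list as a membership bitmap are not taken from the STITE
programs, which keep names in a hash file and lists in linked trees.

```c
#include <assert.h>
#include <string.h>

#define MAX_NAMES 32     /* assumed table size */
#define NAME_LEN  16     /* assumed maximum name length, including NUL */

/* Each name owns an association list, held here as a membership
   bitmap over name indices so that duplicates cannot occur. */
static char          names[MAX_NAMES][NAME_LEN];
static unsigned char assoc[MAX_NAMES][MAX_NAMES];
static int           name_count = 0;
static int           P = -1;          /* pointer register; -1 = not set */

/* Index of a name, or -1 if it is not a recognizable name. */
static int lookup(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < name_count; i++)
        if (strcmp(names[i], name) == 0)
            return i;
    return -1;
}

/* CREATE X: new name with a null list; P is set pointing to X.
   Returns 0 on success, -1 if the command cannot be obeyed. */
int cmd_create(const char *x)
{
    if (lookup(x) >= 0 || name_count == MAX_NAMES || strlen(x) >= NAME_LEN)
        return -1;
    strcpy(names[name_count], x);
    memset(assoc[name_count], 0, MAX_NAMES);
    P = name_count++;
    return 0;
}

/* POINT X: set P pointing to the existing name X. */
int cmd_point(const char *x)
{
    int i = lookup(x);
    if (i < 0)
        return -1;
    P = i;
    return 0;
}

/* ATTACH X: put X on the association list of whatever P points to. */
int cmd_attach(const char *x)
{
    int i = lookup(x);
    if (i < 0 || P < 0)
        return -1;
    assoc[P][i] = 1;
    return 0;
}

/* IS MEMBER X: 1 if X is on the P list, 0 if not, -1 on error. */
int cmd_is_member(const char *x)
{
    int i = lookup(x);
    if (i < 0 || P < 0)
        return -1;
    return assoc[P][i];
}
```

Note that, as in the text, a command names at most one operand; the other
operand is always whatever P currently points to.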



This command language is outlined briefly in Table A.
Some examples of the use of these commands to perform some elementary
operations follow:

1) Create Q so that it is the union of X and Y. That is, the
association list of Q is the union of the association lists of X and Y.

      create Q
      add list X
      add list Y

2) Remove all items from the association list of R, i.e., make
R the null list:

      point R
      delete list R

   or

      create Z
      point R
      delete list not Z

3) Display those items on the association list of X but not on
the association list of Y:

      create TEMP
      add list X
      delete list Y
      display list

4) See if X is on the union of Y and Z:

      create T2
      add list Y
      add list Z
      is member X
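
Because an association list is a set, the list commands used in the four
examples reduce to ordinary set operations. The C sketch below shows this
with a list held as a 32-bit mask, one bit per item. The type name
assoc_list, the function names, and the 32-item limit are assumptions made
for illustration only; the STITE programs hold lists as trees, not as bit
masks.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t assoc_list;   /* bit i set means item i is on the list */

/* ADD LIST X: union of the P list with the X list. */
assoc_list add_list(assoc_list p, assoc_list x)
{
    return p | x;
}

/* DELETE LIST X: items on the P list that are not on the X list. */
assoc_list delete_list(assoc_list p, assoc_list x)
{
    return p & ~x;
}

/* DELETE LIST NOT X: items on both lists (the intersection). */
assoc_list delete_list_not(assoc_list p, assoc_list x)
{
    return p & x;
}

/* IS MEMBER: 1 if item number `item` (0..31) is on the P list. */
int is_member(assoc_list p, unsigned item)
{
    assert(item < 32);
    return (int)((p >> item) & 1u);
}

/* COUNT: number of items on the P list, 0 if null. */
int list_count(assoc_list p)
{
    int n = 0;
    for (; p != 0; p &= p - 1)   /* clear the lowest set bit each pass */
        n++;
    return n;
}
```

In these terms, example 1 is add_list(add_list(0, X), Y), both forms of
example 2 yield the null list because delete_list(R, R) and
delete_list_not(R, 0) are both zero, and example 3 is
delete_list(add_list(0, X), Y).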

Following each command is a response by the system to the user
giving an indication that the command has been carried out or cannot be
obeyed. Examples of commands that cannot or will not be carried out
are as follows:

1) create X          Not allowed if the name X already
                     exists as a recognizable name

2) point X           Not allowed if X does not already
                     exist as a recognizable name

3) remove X          Not allowed if the name X does not
   attach X          already exist as a recognizable name,
   add list X        or if P register has not yet been set
   etc.

4) display           Not allowed if pointer or P register
   display list      has not yet been set to point to a
   count             recognizable name

While this is a very primitive type of command structure for manipulation
of the association lists (sets), it allows the implementation of the desired
processes. Such a simple scheme was chosen for ease and speed of implementation
without the need for a study of language implementation and compiler
writing techniques, which are not directly among the goals of this project.

VI. THE HASHING PROCEDURE

Earlier in this section it was mentioned that the alphabetic
information was entered and linked by a hashing algorithm (Figure 7). This
hash algorithm is in three stages.

First, the alphabetic string is mapped into an integer by a
suitable procedure that tries to give a uniform distribution over some
interval for the character strings to be encountered. There are many ways
of doing this mapping, and nearly any of the traditional methods would be
satisfactory. (The actual scheme used here was to do exclusive OR's of
the 7 bit (ASCII) patterns of the characters, taken pairwise. The resulting
14 bit pattern is taken as a binary integer on the interval 0 to 2^14 - 1.)

Second, this integer, I, is then reduced modulo the hashing interval,
M (by a remainder divide), to give the initial hash position. If this initial
hash position is empty, the item being hashed is placed there. Otherwise,
proceed to the third step.



TABLE A

COMMANDS FOR BUILDING AND MANIPULATING ASSOCIATION LISTS

CREATE X              Create the name "X", associate the null
                      list, set P register pointing to X

POINT X               Set P register pointing to X

ATTACH X              Put X on the association list of whatever
                      is pointed by P register

REMOVE X              Take X off the association list of whatever
                      is pointed by P

DISPLAY               Display name pointed by P

DISPLAY LIST          Display the association list of whatever
                      is pointed by P

ADD LIST X            Add the items on the X association list
                      to the items on the list pointed by P,
                      eliminating duplicates (the union or OR
                      operation)

DELETE LIST NOT X     Delete from the list pointed by P all
                      items not on the X list (the intersection
                      or AND operation)

DELETE LIST X         Delete from the P list all items on the X
                      list (the AND NOT operation)

COUNT                 Counts the number of items on the pointed
                      or P list, 0 if null

IS MEMBER X           Responds yes if X is on the pointed (P)
                      list, no otherwise

RELATION LIST X       Gives set relation between the pointed
                      list and the association list of X. Response:
                         "Identical" if X & P lists identical
                         "Superset" if X is a proper superset of P
                         "Subset" if X is a proper subset of P
                         (Count) of intersection of X & P
                         "Disjoint" if X & P lists disjoint

MAKE LIST NULL        Sets the P list to null

ON SECONDARY PUT X    X is placed on the lists of the items on
                      the list of P

UNION SECONDARY       The list of P is replaced by a list that is
                      the union of lists of the items on the
                      original list of P

INTERSECT SECONDARY   The list of P is replaced by a list that
                      is the intersection of the lists of the
                      items on the original list of P

ABITERATE             The list of P must consist of A field
                      name values (i.e., of the form A{ ...}).
                      The list of P is replaced as follows:
                      1) find the union of all the modules for the
                      A fields, 2) find all the B fields of all
                      these modules, 3) convert all these B fields
                      to A fields (with the same values).
                      The final P list is the set of all the A
                      field name-values obtained in step 3.

BAITERATE             Same as ABITERATE except that the A and B
                      fields are interchanged.

BBITERATE             Same as ABITERATE except that one starts
                      with B fields (and ends with B fields).

ABCONNECT X           Performs ABITERATE until a match on X is
                      obtained; after each iteration the number
                      of items in the P list is given and the
                      option of continuing or stopping is given.

BACONNECT X           Same as ABCONNECT except with A and B
                      fields interchanged.



Third, hashing conflicts are resolved by building a binary tree out
of the initial hash position. The choice of going to the right or left
branch of the tree is determined by the odd or evenness of (I/2^j), where j
is the node level of the tree. The item is put in the first empty position
encountered.
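
The three stages can be sketched in C as follows. This is a reading of the
description above, not the STITE code: the function names are hypothetical,
and three details are assumptions, namely that "taken pairwise" means each
pair of characters forms one 14 bit word (first character in the high 7
bits) with the words then exclusive-ORed together, that an odd final
character is paired with zero, and that (I/2^j) means the integer quotient,
so that the branch at level j is simply bit j of I.

```c
#include <assert.h>
#include <stddef.h>

/* Stage 1: fold a string into a 14 bit integer I by exclusive-ORing
   14 bit words, each built from a pair of 7 bit ASCII codes. */
unsigned hash_string(const char *s)
{
    unsigned acc = 0;
    assert(s != NULL);
    while (*s != '\0') {
        unsigned hi = (unsigned char)*s++ & 0x7Fu;
        unsigned lo = 0;
        if (*s != '\0')                    /* odd length: pair with zero */
            lo = (unsigned char)*s++ & 0x7Fu;
        acc ^= (hi << 7) | lo;
    }
    return acc;                            /* 0 .. 2^14 - 1 */
}

/* Stage 2: initial hash position, I reduced modulo the interval M. */
unsigned initial_position(unsigned I, unsigned M)
{
    assert(M > 0);
    return I % M;
}

/* Stage 3: branch taken at tree level j: 1 if the integer part of
   I / 2^j is odd, 0 if it is even.  Which value means "right" is
   left open here. */
int branch_at_level(unsigned I, unsigned j)
{
    if (j >= 14)
        return 0;                          /* I has only 14 bits */
    return (int)((I >> j) & 1u);
}
```

One consequence visible in the sketch is that two strings with the same I
follow the same path down the tree, so a string comparison is still needed
at each occupied position.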

Figure 9 gives a pictorial representation of this hashing scheme.
This algorithm can be expected to give uniform distribution over M and
relatively well balanced trees if M is prime, small with respect to Imax
(the maximum of I), and the I's are distributed uniformly over the interval
0 to Imax. In this application Imax = 2^14 - 1 and M is a program parameter
normally set around 2^[exponent illegible in this copy]. The actual computer
program is further complicated by the paging of the links, which makes it
desirable to make M the product of a number of pages times the number of
links used per page. Thus, instead of M being prime, it is the product of
two primes, one of which is 41, the number of links used per page.

The final position in the hash table or tree of an alphabetic string
will be referred to as its internal name, its external name being the
alphabetic string itself. The programs then deal with this internal name
wherever possible, using the external name only when communicating with the
user. Note that once assigned, the internal name is unique.

VII. ASSOCIATION LIST STRUCTURE

The association lists are also trees, being in this case ternary trees.
(Ternary trees were chosen for convenience; this makes the linkage sizes
for the two groups of linkages, the hash tables and trees, and the association
lists, compatible. Thus, it is convenient to mix these two in the same physical
file and buffer areas.)



[Fig. 9. The hashing trees for alphabetic strings. Diagram not legible in
this copy.]

The algorithm for placing the entries on the tree is "hash" like:
The name of the object being placed (in this case always the internal name)
is mapped onto the interval 0,1,2 at each node. This mapping is pseudo
random in the sense of giving 0, 1, and 2 with equal frequency but is unique
and reproducible for each internal name. Many schemes for doing this are
satisfactory, and the one used here was chosen for programming convenience.
It is best described by giving the Algol-like program for performing the
choice at each step:

      begin
         if t ≠ 0 then t := t ÷ 3
         else begin
            j := j + 1;
            t := (j × line) V page
         end;
         k := t mod 3
      end

Here t and j are initialized to zero (at the base of the tree).
The operator V is the bit wise exclusive OR operation, and mod is the
remainder divide operation. Page and line are the (unique) page and line
numbers of the internal name being entered on the list. k gives the branch
on the tree at each node, i.e., k = 0, 1, or 2.
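
A C rendering of this step is sketched below. It rests on a particular
reading of the listing, which is badly printed in the source: that t is
divided by 3 when nonzero, that j is incremented otherwise, and that the
operator combining j and line is multiplication. The last point in
particular is an assumption. The names branch_state and next_branch are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* State carried down the tree while placing or seeking one name. */
struct branch_state {
    unsigned t;   /* remaining base-3 digits; starts at 0 */
    unsigned j;   /* refill counter; starts at 0          */
};

/* Return the branch k (0, 1, or 2) to take at the next node for the
   internal name (page, line).  The operator between j and line is
   illegible in the source; multiplication is ASSUMED here. */
unsigned next_branch(struct branch_state *st, unsigned page, unsigned line)
{
    assert(st != NULL);
    if (st->t != 0) {
        st->t /= 3;                       /* move to the next base-3 digit */
    } else {
        st->j += 1;
        st->t = (st->j * line) ^ page;    /* ^ is bitwise exclusive OR */
    }
    return st->t % 3;
}
```

Starting from a zeroed state, repeated calls with the same page and line
always yield the same sequence of branches, which is what makes a later
look up retrace the path taken at insertion.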
A few words about these "hashed" trees used for the lists as compared
to sorted trees. First, one notes that the ternary trees are obviously
faster to access than the corresponding binary trees. That is, the average
number of looks per item sought is smaller by the ratio log 2/log 3.
Second, one notes that hashed trees are on the average much better balanced
than the sorted trees. In fact, hashed binary trees are on the average about
as well balanced as (sorted) AVL trees, yet require no balancing.

Finally, sorted trees are normally binary, ternary sorted trees
being unduly complex and less efficient.

Thus, one concludes that for single item look up, the hashed trees
are superior to sorted trees and even to balanced sorted trees. Of course,
there is a disadvantage in not having the information sorted, and retrieving
large numbers of consecutive items from a hashed tree is not easy. Furthermore,
the merging of two trees cannot make use of the merge algorithms for
sorted trees. However, it is not anticipated that either of these two kinds
of operations (interval retrieval or merging) will be particularly significant
in the application of this system.

VIII. CHARACTERISTICS OF LINKAGE-PAGING SYSTEM

The computer programs for processing the records are written in the
C language for the PDP 11 computer. C is an Algol-like language particularly
well adapted to the PDP 11 computer and operates under the Unix operating
system, a time-sharing system for the PDP 11.

The PDP 11 configuration in use has 104 K bytes of core and about
200 M bytes of disk storage, although, of course, not all of this is
available to a single user. The maximum core allowed a single user by the
Unix system is about 64 K bytes. It is anticipated that this project will
use about 20 M bytes of disk storage.

Every effort is being made in the programming to use modular and
hierarchical techniques. The C language is well suited to these methods.

Because of the limited core, the first requirement of the programs
is for a (data) overlay or paging system, and this is incorporated at the
lowest level. Since it is the linkages that are expected to be referenced
most frequently, these were incorporated into the paging system. The
alphabetic information will be needed only for communication with the
user and is referenced as needed; it is not involved in the "automatic"
paging.

Many of the characteristics of the link-paging system are determined
by the machine being used (PDP 11) and the operating system (Unix). The
fixed page size of 512 bytes was chosen to fit the hardware/software. A
link consists of four addresses, each address being 3 bytes: 2 bytes to
indicate the page number and one for the line (or link) within the page.
This allows the spanning of 64 K pages of linkage data, the maximum allowed
to a user (by Unix). This also allows 42 lines (or links) per page with 8
bytes left over for some housekeeping chores.
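
The arithmetic behind these figures can be checked with a short C sketch.
The struct and constant names below are hypothetical, and the order of the
two page-number bytes is an assumption; only the sizes come from the text.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES     512     /* fixed page size                    */
#define ADDR_BYTES     3       /* 2 bytes page number + 1 byte line  */
#define ADDRS_PER_LINK 4       /* a link consists of four addresses  */
#define MAX_PAGES      65536L  /* 2 bytes of page number: 64 K pages */

/* One address: a page number and a line (link) within that page. */
struct link_addr {
    uint8_t page_hi;   /* high byte of page number (byte order assumed) */
    uint8_t page_lo;   /* low byte of page number                       */
    uint8_t line;      /* line (link) within the page                   */
};

/* One link: four addresses, 12 bytes in all. */
struct link {
    struct link_addr addr[ADDRS_PER_LINK];
};

enum {
    LINK_BYTES     = ADDR_BYTES * ADDRS_PER_LINK,              /* 12 */
    LINKS_PER_PAGE = PAGE_BYTES / LINK_BYTES,                  /* 42 */
    SPARE_BYTES    = PAGE_BYTES - LINKS_PER_PAGE * LINK_BYTES  /*  8 */
};

/* Reassemble the 16 bit page number from its two bytes. */
unsigned addr_page(const struct link_addr *a)
{
    assert(a != 0);
    return ((unsigned)a->page_hi << 8) | a->page_lo;
}
```

With 12 bytes per link, 42 links occupy 504 bytes, which leaves the 8
housekeeping bytes mentioned above.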

A system of N buffers is set up, with each buffer holding one page.
N is a program variable and is expected to be between 32 and [upper bound
illegible in this copy], consistent with other core requirements. The linkage
pages are moved in and out of the buffer areas as needed, with a directory
kept of which are in core.

In addition, the directory keeps a record of the manner in which
the pages are referenced. Those pages referenced least recently and/or
least frequently are first to be moved out when a new page (a page not in
core) is required to be moved in. Note this is neither a first-in-first-out
nor a random paging system but a "least recent and least frequently
referenced, first out" system.

The concept here is that during a given time interval in the running
of the program, certain pages will be referenced with high frequency and
others less often. The particular set of pages so referenced will change
with time, or with the demands of the program. The paging system is expected
to adapt to these changes and to adjust in a statistical manner to the
changing page requirements. It is also expected that the linkages will
cluster so that there is a high probability that links will point to others
on the same page. While no systematic efforts to produce clustering are made,
this particular application and the manner in which the links are initially
assigned will encourage clustering. However, it is recognized that if
extensive changing and reassignment of the linkages occurs, the clustering
or correlation of the links within a page will disappear and become random
or incoherent. This could be corrected by more sophisticated garbage
collection and link assignment algorithms, but such techniques were not
undertaken at this time.

The paging algorithm works as follows: There are N buffers, labelled
0 through N-1, each holding one page. Associated with each buffer is a pair
of pointers designating a predecessor and a successor for that buffer, plus
one more pointer pair, not associated with any buffer, used to designate the
first and the last buffer. The N+1 predecessor-successor pointer pairs are
labelled 0 through N, and all but the last have an associated buffer.
Initially they are linked as in Figure 10a; 1 is the predecessor of 2,
2 is the predecessor of 3, ..., 1 is the successor of 0, 2 is the successor
of 1, .... This is usually referred to as a doubly linked circular list.
The arrowed lines show the predecessor-successor relation. Note that N
is the predecessor of 0 and successor of N - 1 in Fig. 10a. Fig. 10b is
another pictorial representation showing the buffers as a simple ordered set
with 0 at the top and N - 1 at the tail of the sequence.

Now whenever a buffer is referenced, the pointers are changed so that
the referenced buffer becomes the successor of N, the pointers being
reconnected to show this shift. Thus, if buffer 2 is referenced, the
pointers will be changed to look like Fig. 10c. Considering the successor
of N to be the head of the list and the predecessor of N to be the tail,
the reconnection to Fig. 10c has promoted 2 to the head of the list while
demoting by one position all those that were ahead of 2, without changing
the position of the others. Fig. 10d is the other pictorial representation,
again showing the buffers as an ordered set with 2 at the top. One says
that buffer 2 has been "popped to the top" of the list.

Note that nothing really "moved"; only three pointer pairs were
broken and reconnected (three memory swappings or nine fetches and nine
stores to reconnect the pointers). The pointers changed are indicated with
an X in Fig. 10c. The algorithm in an Algol-like notation is as follows:

      sn    = s[n];
      pj    = p[j];
      sj    = s[j];
      s[j]  = sn;
      s[n]  = j;
      s[pj] = sj;
      p[j]  = n;
      p[sn] = j;
      p[sj] = pj;

Here sn, pj, and sj are temporary storage, s[j] and p[j] are the successor
and predecessor of the jth buffer, n is the last or Nth pointer pair, and
j is the index of the buffer being promoted.
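
The nine assignments translate directly into C. The sketch below keeps the
report's names s, p, n, j, sn, pj, and sj; the function names, the value
chosen for N, and the initialization routine are assumptions added for
illustration. One guard is also added that is not in the listing: if j is
already the successor of N, the nine assignments applied as written would
unlink it, so the sketch returns early in that case.

```c
#include <assert.h>

#define NBUF 8                 /* N, a program parameter; 8 is arbitrary */

static int s[NBUF + 1];        /* successor of each pointer pair   */
static int p[NBUF + 1];        /* predecessor of each pointer pair */

/* Link 0, 1, ..., N into a doubly linked circular list (Fig. 10a):
   0 is the successor of N and N - 1 is its predecessor. */
void init_list(void)
{
    for (int i = 0; i <= NBUF; i++) {
        s[i] = (i + 1) % (NBUF + 1);
        p[i] = (i + NBUF) % (NBUF + 1);
    }
}

/* Pop buffer j to the top, i.e. make it the successor of N. */
void pop_to_top(int j)
{
    const int n = NBUF;
    assert(j >= 0 && j < NBUF);
    if (s[n] == j)
        return;                /* guard added here: j is already the head */

    int sn = s[n];
    int pj = p[j];
    int sj = s[j];
    s[j]  = sn;                /* j now precedes the old head   */
    s[n]  = j;                 /* j becomes the successor of N  */
    s[pj] = sj;                /* close the gap j leaves behind */
    p[j]  = n;
    p[sn] = j;
    p[sj] = pj;
}

int head_buffer(void) { return s[NBUF]; }   /* most recently referenced  */
int tail_buffer(void) { return p[NBUF]; }   /* least recently referenced */
```

After init_list(), a call pop_to_top(2) gives the order 2, 0, 1, 3, ...,
which corresponds to the situation the text describes for Fig. 10c.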

Each time a buffer reference is made, the referenced buffer is
popped to the top of the list in the manner just illustrated.

This is the update procedure that keeps a record of which buffer
or page was referenced last, which second last, which third last, and
so on. The least recently referenced buffer will be the predecessor of N;
its name or location is in the predecessor position of the Nth pointer
pair and is easy to find.

Now the swapping algorithm is just that the page in the least recently
referenced buffer is swapped out when a new page, a page not now in core,
is needed. The old page is rolled out (if it has been written on), the new
page is given this buffer location, and the pointers are reconnected so that
the buffer of the newly brought in page is at the head of the list.

One final variation on this theme is added. Instead of having a
buffer pop to the top on every reference, it may be moved up only sometimes.
A pop to the top may take place on only every other, or every third reference.
This will result in a probability distribution for the buffer positions in
the swap out list. Those referenced with higher frequency have a higher
probability of being near the top and those referenced less frequently will
probably be near the bottom. Swap out takes place from the bottom of the
list so that those pages referenced least frequently and/or least recently
have a higher chance of being rolled out when a new page is needed. (It is
assumed here that large numbers of page references are more or less random
or at least have no systematic periodicities that might discriminate against
certain frequently referenced pages being moved to the top of the list.)

Whether pop to the top takes place every third or every second or on
every page reference is a program parameter and can be set as desired. What
is optimum in this application would need to be determined by extensive
experiments. Such experiments are far afield from the real goals of this
project and will not likely be pursued at this time. However, preliminary
analysis suggests the following: If the usual situation is that page
references are highly clustered (that is, if many successive references to
the same page are likely), then pop to the top every third or fourth or even
tenth reference would be more efficient than promotion on every reference.
If, at the other extreme, the correlation is very low or nearly random and
pages are seldom referenced more than once or twice during their lifetime in
core (i.e., before they are swapped out again), then pop to the top on every
reference would be more efficient. One expects the situation here to be
somewhere in the middle of these extremes, so promotion every second or
third reference will be tried to start with; performance of the overall
system is not expected to be very sensitive to this parameter.
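
The promotion parameter can be sketched in C as shown below. The sketch
assumes a single counter shared by all buffers, which is only one reading of
"every second or third reference"; the text does not say whether the count is
kept globally or per buffer. The names list_init, note_reference,
promote_every, and refs_seen are hypothetical.

```c
#include <assert.h>

#define NBUF 4                       /* assumed number of buffers */

static int s[NBUF + 1], p[NBUF + 1]; /* ring; entry NBUF is the list head */
static int promote_every;            /* program parameter: 1 = every reference */
static int refs_seen;                /* references since the last promotion */

void list_init(int every)
{
    assert(every >= 1);
    for (int i = 0; i <= NBUF; i++) {
        s[i] = (i + 1) % (NBUF + 1);
        p[i] = (i + NBUF) % (NBUF + 1);
    }
    promote_every = every;
    refs_seen = 0;
}

/* Move buffer j to the top of the swap-out list. */
static void promote(int j)
{
    int n = NBUF, sn = s[n], pj = p[j], sj = s[j];

    if (sn == j)                     /* already at the top */
        return;
    s[j] = sn;  s[n] = j;   s[pj] = sj;
    p[j] = n;   p[sn] = j;  p[sj] = pj;
}

/* Record a reference to buffer j.  Only every promote_every-th
 * reference actually moves a buffer; returns 1 when it did. */
int note_reference(int j)
{
    assert(j >= 0 && j < NBUF);
    if (++refs_seen < promote_every)
        return 0;                    /* skip the pointer updates this time */
    refs_seen = 0;
    promote(j);
    return 1;
}
```

With list_init(1) the behaviour reduces to promotion on every reference; a
larger value trades ordering accuracy for fewer pointer updates.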

IX. STRUCTURE OF THE PROGRAMS 

The C language (1) is a procedure or subroutine oriented language and
well suited to modular and hierarchical programming techniques. Here will
be given some of the structural characteristics of the programs.

The basic building unit is the procedure. A procedure is a more or
less closed set of code that is called or invoked as a unit. It can have
within it calls on other procedures (including itself). Procedures
communicate with each other both through parameters and global variables.
(While it is considered that procedure communication through global
variables is "dangerous", i.e., an error-prone method of doing things, it
would be most difficult and awkward to avoid in the C language. The best
that can be done is to keep global variables to a minimum.)

The procedures are grouped into sets (called blocks) that perform
particular tasks. Sets of these blocks are then built into programs that
perform higher order tasks. Sets of programs can also be grouped together
to perform yet higher order tasks. The program set is the top of the
hierarchy in the C-Unix system on the PDP-11 (2).

Figure 11 is a block diagram for the module processing program that
loads the modules and casts them into their linked list structure form.
Each block represents a collection of procedures used to perform the indicated
tasks. Global variables are global only within a block and may not be
referenced outside of that block. Procedures within a block may reference
each other or those in a lower block. Thus, for example, the set manipulation
procedures reference each other and those of link-paging and file
management, but link-paging and file management never reference each other
nor the global variables associated with each other's blocks. This hierarchy
structure makes debugging and troubleshooting easier since the blocks can
be checked out in order from bottom to top.
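
In present-day C, one way to obtain variables that are global only within a
block is to give each block its own source file and declare its shared
variables and private helpers `static`. The sketch below assumes that
arrangement; whether the original programs did exactly this is not stated.
The names lp_load_page, lp_pages_in_core, count_page, and MAXPAGES are
hypothetical.

```c
#include <assert.h>

#define MAXPAGES 8                 /* assumed capacity, for illustration */

/* Lower block: would live in its own source file. */
static int pages_in_core;          /* block-level global: file scope only */

static void count_page(void)       /* private helper of this block */
{
    pages_in_core++;
}

void lp_load_page(void)            /* entry point offered to higher blocks */
{
    assert(pages_in_core < MAXPAGES);
    count_page();
}

int lp_pages_in_core(void)         /* read-only access for higher blocks */
{
    return pages_in_core;
}
```

A higher block can call lp_load_page() and lp_pages_in_core(), but it cannot
name pages_in_core or count_page directly, so each block can be checked out
on its own from the bottom up.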

(Note here that "proving correctness" of our programs is not being 
done in any sense, but rather only methods which produce fewer errors and 
make those errors that do occur easier to find are being adopted.) 

Even the procedures within each block have a hierarchy structure.
Fig. 12 is a lattice diagram for the link-paging system showing which
procedures reference which within this block. Note that a procedure only
references those below it in the diagram to which it is connected by a
line.

Figures 13 through 24 give the flow diagrams for the individual
procedures of the link-paging system. These are just the procedures
indicated as "The Link-Paging Management Procedures" in the lower left
box of the block diagram, Fig. 11, and in the lattice diagram, Fig. 12.

Figures 25, 36, and 43 give the lattice diagrams for the file
management procedures, the hashing procedures, and the set manipulation
procedures respectively. Figures 26 through 33 give the flow diagrams for
file management, 37 through 42 give the flow diagrams for hashing, and
44 through 60 are the flow diagrams for set manipulation. Fig. 61 gives
the lattice diagram for the module processing procedures while Figures
62 through 69 give the flow diagrams for each of those procedures.






Most of the blocks in Fig. 11 will appear repeatedly in the other program
block diagrams since they are the basic units needed in all link, character
and set manipulations of the programs. This is the reason for giving so
much detail on these blocks here. For example, in Fig. 70 is given the
block diagram for the interactive (set manipulation) program. One sees
that it uses most of the same procedures used for module processing.
Fig. 71 is the lattice diagram for the interactive procedures, and Figures
72 through 87 give the flow diagrams for each of the individual procedures.






[Fig. 11. THE MODULE PROCESSING PROGRAM (block diagram). Blocks: Module
Processing Procedures and Main Program (mp.c); Set Manipulation Procedures
(set.c); Hashing Procedures (hl.c); The Link-Paging Management Procedures;
Utility Procedures (u.c); The File Management Procedures (cf.c).]



[Fig. 12. LATTICE DIAGRAM OF PROCEDURES OF LINK-PAGING SYSTEM. The
connecting lines are not recoverable; the procedures shown are:

    Newlink:   Fetches a new link
    Getlink:   Fetches desired link
    Reclaim:   Reclaims a no longer used link
    Setpage:   Sets pointer to desired page
    Setlink:   Sets value of a link
    Linkinit:  Initializes link system
    Rollin:    Rolls out least used page and rolls in needed page
    Shift:     Promotes page last referred
    Removeh2:  Removes page name from directory
    Enterh2:   Enters page name in directory
    Getlf:     Finds page in directory
    Read512:   Reads in one page
    Error:     Delivers error message]



[Fig. 13. Flow diagram of a link-paging procedure. Recoverable steps: get a
never-before-used link; return.]



[Fig. 14. RECLAIM: reclaims no longer needed link. Flow diagram.]



[Fig. 15. GETLINK: fetches desired link. Flow diagram. Recoverable steps:
(point to) desired link; fetch components of link.]



[Fig. 16. SETLINK: sets new values of a link. Flow diagram. Recoverable
steps: start setlink; point to desired link; test page and line; if not in
range, error message "page or line out of range"; set link to appropriate
value; mark write flag.]



[Fig. 17. SETPAGE: sets page pointer to desired page. Flow diagram.
Recoverable steps: two yes/no tests; roll (in) desired (page); promote page
to top of rollout list; return.]



[Fig. 18. ROLLIN: rolls out old page and brings in new. Flow diagram.
Recoverable steps: start rollin; check page at bottom of rollout list; roll
out page at bottom of list (old page); remove old page from (directory);
read in new page; enter new page in directory; set write flag null; promote
new page to top of rollout list; return.]



[Fig. 19. ENTERH2: enter new page name in directory. Flow diagram.
Recoverable steps: find new page position in directory; enter new page in
directory and set page pointer to it; on program error, give error message
and halt.]



[Fig. 20. REMOVEH2: removes old page name from directory. Flow diagram.
Recoverable steps: enter removeh2; find page name in directory; was old page
in directory?; on program error, give error message and halt.]



[Fig. 21. GETLF: find page name or position in directory. Flow diagram.
Recoverable steps: start getlf; set flag to not found; set flag to found;
return.]



[Fig. 22. SHIFT: promote page last referenced to top of roll out list. Flow
diagram. Recoverable steps: start shift; is page already at top? (yes:
return); link in page to top of list; close links in gap; return.]



[Fig. 23. LINKINIT: initializes linkage system. Flow diagram. Recoverable
steps: start linkinit; open link files; read in first n pages until buffers
are full (n = number of buffers); set all write flags to null; initialize new
linkage assignment counters (i.e., reclaim-list counter and first-never-used
counter); set up directory of pages in core; return.]



[Fig. 24. READ512: reads in one page (512 characters) of links. Flow
diagram. Recoverable steps: start read512; read page into appropriate buffer;
was there a read error? (yes: give read-error message; no: return).]



[Fig. 25. LATTICE DIAGRAM OF FILE MANAGEMENT BLOCK (cf.c). The connecting
lines are not recoverable; the procedures shown are:

    Chareq:    Check equality between strings (core and flch)
    Readch:    Gets character from file flch (buffer)
    Stringeq:  Check equality between strings (file and file)
    Prstrng:   Print string from file flch
    Stringpr:  Print string from (any) file
    Rdchfl:    Get single character (any file)
    Charead:   Fills read buffer
    Writech:   Writes single character in file (buffer)
    Chinit:    Initialize; opens file, etc.
    Flushch:   Close out character file]



[Fig. 26. WRITECH: writes a character on file (buffer). Flow diagram.
Recoverable steps: start; move character to j-th element of buffer; test
"c = 0 and character counter even"; increment character count, put null in
next position; buffer full? (yes: empty buffer, reset counter); increment j;
page full? (yes: increment page counter, reset character counter); return.]



[Fig. 27. CHAREAD: fills read buffer from file. Flow diagram. Recoverable
steps: start charead; fill buffer; was there a file error? (yes: deliver
error message, program error, halt); return.]



[Fig. 28. RDCHFC: reads a single character from a file (buffer). Flow
diagram. Recoverable steps: start rdchfc; refill buffer? (yes: read new
buffer full, set counters); return character.]



[Fig. 29. STRINGPR: prints string from file. Flow diagram. Recoverable
steps: start stringpr; give carriage return; get a character from file
(buffer); increment counters; time for carriage return? (yes: put out
carriage return); print character; return.]



[Fig. 30. PRSTRNG: print string from file flch. Flow diagram.]



[Fig. 31. STRNGEQ: tests for equality between two strings in (possibly
different) files. Flow diagram. Recoverable steps: initialize counters; read
character from each file; test "equal and not ..."; return true if two
characters are equal, else false.]



[Fig. 33. CHAREQ: equality test between a string in core and a string on
file. Flow diagram. Recoverable steps: start chareq; initialize counters; get
character from file (buffer); return true if equal, false otherwise.]



[Fig. 36. Diagram of the hashing algorithm block (hl.c). The connecting
lines are not recoverable; the procedures shown are:

    Dmpchfl:   Prints all hashed strings
    Hash01:    Main hashing algorithm
    Flushbuf:  Empty buffers, write counts
    Dump:      Prints each string
    Putout:    Writes string in files, sets pointers
    Getname:   Finds start of string in file]



[Fig. 37. FLUSHBUF: empty buffers, write counters. Flow diagram. Recoverable
steps: write character counts (flushch); write link count and empty buffers
(flushln).]



[Fig. 38. DMPCHFL: writes out all the strings hashed into the character
file. Flow diagram. Recoverable steps: initialize counter; write string and
all strings emanating from this position.]



[Fig. 39. DUMP: write out a particular string and all strings emanating
from that one. Flow diagram. Recoverable steps: fetch location of string;
string here? (yes: print string, print all strings emanating; no: return).]



[Fig. 40. HASH01: main hashing algorithm. Flow diagram. Recoverable steps:
start hash01; generate hash index from character string (page and line); get
string location; get a new link.]








[Fig. 41. PUTAT: writes string, sets pointers to it. Flow diagram.]



[GETNAME: finds start of string in character file. Flow diagram.]



[Fig. 43. LATTICE DIAGRAM FOR PROCEDURES IN SET MANIPULATION BLOCK* (set.c).
*Procedures called in lower blocks not shown.]



[Fig. 44. TEMPLIST: creates a literal (external) name for internal use only.
Flow diagram. Recoverable steps: hash in new name; on program error, deliver
error message.]



[Fig. 45. EXLISTS: interchange names of two lists. Flow diagram. Recoverable
steps: start exlist; get head of list for first name; get head of list for
second name; put first list with second name; put second list with first
name.]



[Fig. 47. CNTLIST: counts the items on a list. Flow diagram.]



[PRINTSET: print the elements of a set. Flow diagram. Recoverable step: give
carriage return (line feed).]



[Fig. 49. PRNLIST: prints a list. Flow diagram. Recoverable steps: get first
item of list; get literal name of first item; print literal name of item;
"prnlist" for all (3) branch lists emanating from this item.]



[Fig. 50. GARBCAL: collects no longer needed lists and puts links on free
list. Flow diagram. Recoverable steps: start garbcal; get first link (item)
of list; reclaim link; "garbcal" for all (3) branch lists emanating from this
item.]



[Fig. 52. DIFSET: gives difference of two sets. Flow diagram. Recoverable
step: (difference) merge first list with second set.]




[PRODSET: gives intersection of two sets. Flow diagram.]




[Fig. 56. SUBLIST: check if list is a sublist of a named list. Flow diagram.
Recoverable steps: get name to be checked; yes/no test; "sublist" for all (3)
branch lists emanating from this item; return (0).]



[Fig. 57. ADDSET: give union of two named sets. Flow diagram. Recoverable
steps: fetch list of first set; add list to second set.]



[Fig. 58. PDLIST: puts item from (A) list on other named (C) list depending
on another named (B) list. Flow diagram. Recoverable steps: get item from (A)
list; put item on (C) list; "pdlist" for all (3) branch lists emanating from
this item; return.]



[Fig. 59. ONLIST: checks to see if item is on named list. Flow diagram.
Recoverable steps: get link of name; initialize tree search; get link;
calculate next tree branch.]



[PUTITEM: places an item on named list. Flow diagram. Recoverable steps: get
link of name; get link; calculate next tree branch.]



[Fig. 61. LATTICE DIAGRAM, MODULE PROCESSING BLOCK* (mp.c).
*Procedures called in lower blocks not shown.]



[Fig. 62. PUTFLD: hashes in character strings and sets links. Flow diagram.
Recoverable steps: hash in field value; put field value on list of module;
put module on list of field value.]



[Fig. 63. CHREAD: reads character from file (buffer). Flow diagram.
Recoverable steps: read a character; return.]



[Fig. 64. INITMOD: initializes module file and counters. Flow diagram.
Recoverable steps: open "stite.mod"; check for error; initialize counters,
set in check characters; get last module number; return.]



[GETMODN: process module name (number). Flow diagram. Recoverable steps: get
first 4 characters; push down, get next character; remove blanks and convert
integer; out of sequence message; put in (terminator) and null; hash in mod
name; return last character.]



[Fig. 66. GETTEXT: process free text. Flow diagram. Recoverable steps: get
first 4 characters; push down, get next character; get page and character
location for text insertion; write character and push down; write out
characters, push down, get next character; put text on list of module; return
last character.]



[Fig. 68. GETLAST: process last or "P" field. Flow diagram. Recoverable
steps: start getlast; insert field name and "{"; remove blanks, initialize
search; push down, get next character; remove blanks; insert (terminator) and
null; record field value; return last character.]



[Fig. 69. MAIN: driver or main program for module processing. Flow diagram.
Recoverable steps: (set) debug flags; initialize files (character, link,
module); module count exceeded? (yes: flush buffers, print counts, exit);
process module name (number), get text, get "A" field; is the next field
"D"? (yes: process last ("D") field); process field; are fields in order?
(no: search for proper field name, give missing field message).]



[Fig. 70. THE INTERACTIVE (SET MANIPULATION) PROGRAM (block diagram).
Blocks: Interactive (Set Manipulation) Procedures and Main (inter.c); The
Link-Paging Management Procedures (lp.c); Set Manipulation Procedures
(set.c); Hashing Procedures (hl.c); Utility Procedures (u.c); The File
Management Procedures (cf.c).]



[Fig. 72. MAIN: main program of interactive set manipulator. Flow diagram.
Recoverable steps: print timing, get next request; branch to appropriate
operation (CREATE, COUNT, POINT, MAKE LIST NULL, ATTACH, DELETE, REMOVE,
IS MEMBER, DISPLAY, ADD LIST, RELATION, "DO NOT UNDERSTAND"); on exit, flush
buffers.]



[Fig. 73. PMAKE: set list to null. Flow diagram. Recoverable steps: start
pmake (here for "make"); get next word; if not recognized, "don't
understand", return; make list the null list; return.]



[Fig. 74. PREMOVE: removes single item from list. Flow diagram. Recoverable
steps: start premove; get next string; set internal name lists to null;
remove item from list; otherwise print "no such..."; return.]



[Fig. 75. PDELETE: remove list from another. Flow diagram. Recoverable
steps: start pdelete; get next word; print "no such..."; return.]



[Fig. 77. PRELATION: determines relation between two sets. Flow diagram.
Recoverable steps: here for "relation"; get next word; test for "list" (no:
"don't understand..."); get string; print "no such..."; check for sub- and
super-set; subset? (yes: print "subset"); superset? (yes: print "superset");
(both:) print "identical"; count intersection and print count; return.]



[Fig. 78. Check for set membership. Flow diagram. Recoverable steps: here
for "is"; get next word; "don't understand..."; print "no such..."; print
(result); return.]



[Fig. 81. PDISPLAY: display item or list. Flow diagram. Recoverable steps:
here for "display"; get next word; print item in P register; print list of
items pointed by P register; "don't understand..."; return.]



[Fig. 82. PPOINT: set pointer to appropriate set. Flow diagram. Recoverable
step: set pointers and flag.]



[Fig. 83. PCREATE: introduces new name. Flow diagram. Recoverable steps:
here for "create"; get string.]



[Fig. 84. GETWORD: gets the next word. Flow diagram. Recoverable steps: read
word (63 character limit; no embedded blanks); null terminate.]



[Fig. 85. UTINIT: initialize interactive mode. Flow diagram. Recoverable
steps: initialize character file; initialize links file; print brief
instructions; request timing decision; get first word; set flag.]



[Fig. 87. GETSTR: gets a string with (possibly) embedded blanks. Flow
diagram. Recoverable steps: remove leading blanks; read string (63 character
limit, embedded blanks allowed); remove trailing blanks; null (terminate).]



X. REFERENCES

1. C Reference Manual. Western Electric Company (Bell Telephone
   Laboratories).

2. Thompson, K., and D. M. Ritchie. Unix Programmer's Manual,
   Fourth Edition. Bell Telephone Laboratories, Inc., 1973.

3. Zunde, Pranas. Scientific and Technical Information Transfer
   for Education (STITE). Atlanta, Georgia: School of Information
   and Computer Science, Georgia Institute of Technology, December,
   1973. Progress Report, NSF Grant No. GN-36114.

4. Zunde, Pranas. Scientific and Technical Information Transfer
   for Education (STITE). Atlanta, Georgia: School of Information
   and Computer Science, Georgia Institute of Technology, June,
   1974. Progress Report, NSF Grant No. GN-36114.







APPENDIX 







A. LETTER TO USERS OF THE GEORGIA 
INFORMATION DISSEMINATION CENTER (GIDC) 



SCHOOL OF INFORMATION AND COMPUTER SCIENCE / ATLANTA, GEORGIA 30332



November 26, 1974 



Dear Colleague: 



Most scientific and technical information resources that are avail- 
able through computerized services (such as the service provided by the 
Georgia Information Dissemination Center of the University of Georgia) 
are utilized relatively little for educational purposes. Under a National 
Science Foundation grant to the Georgia Institute of Technology, we are 
currently studying the reasons why these information resources are so 
little used for educational purposes, and trying to determine ways and 
means of increasing their utilization for the improvement of instruction.
In this study, we need to draw on the experience and expertise of people 
such as you, who have actually worked with systems of this type. We hope 
very much that you will be so kind as to help us. 

Specifically, we are asking you and other users of the Georgia Infor- 
mation Dissemination Center to complete the attached questionnaire at your 
earliest convenience and return it to us in the enclosed pre-addressed 
envelope. Please feel free to supplement the questions on the form by 
adding your own comments and observations about your practices and experiences. 
Your assistance is essential to the success of this inquiry, from which the
whole community of educators will eventually benefit.



Looking forward to your response, we thank you in advance for your kind
cooperation. 







Pranas Zunde 


Professor 

Information & Computer Science 



PZ:t8s 






B. QUESTIONNAIRE TO USERS OF THE
GEORGIA INFORMATION DISSEMINATION CENTER (GIDC) 



1. Please indicate your position by checking the appropriate line.

Undergraduate student 

Graduate student 

Instructor or lecturer 

Assistant Professor 

Associate Professor 

Professor 

Other academic (Please specify) 

Other non-academic (Please specify) 



2. Have you used the information that you have received from the Georgia 
Information Dissemination Center (GIDC) for instructional purposes 
of any kind? 

Yes (Please go to question 3.)

No (Please go to question 7.) 



3. For what specific instructional or educational task(s) have you been
able to utilize the computer-based bibliographic retrieval services
of GIDC?

Development of a new course 

Updating of an existing course

Preparation of illustrative examples 

Selection of case studies 

Compilation of bibliographies or reading lists 

Collection of data 

Current awareness in subject area of a course

Preparation of quizzes, tests, and other exercises 

Other (Please specify) 



4. In what subject area did you utilize this information? (For example: 
biology, chemistry, mathematics, physics, etc.)






5. For what types of courses have you used this information? 
(Please check as many as applicable.) 



Lecture
Seminar 

Special project 

Laboratory

Other (Please specify) 



6. Were titles and abstracts which were provided to you by the service 
sufficient for your purposes, or was it necessary for you to obtain 
full-text documents? 

Titles and abstracts were sufficient 

Titles and abstracts were not sufficient 



7. What improvements to the system would enhance utilization for instructional 
purposes? 

Easier access to the system 

Shorter waiting times for information delivery 

More descriptive abstracts 

Browsing capability 

Interactive system for query or profile formulation 



8. What other suggestions would you have for the improvement of the system?



9. Eventually we might want to contact you for further comment and/or
clarification. If that is agreeable to you, please provide the 
following information: 






Name:



Telephone: 



